FOREWORD


Professor S. N. Roy’s monograph will be a valuable addition to the other 
important publications on multivariate analysis. The monograph does not attempt 
to cover the entire field of multivariate analysis but it includes a good deal of 
new material which would be of interest to advanced students and research workers. 


I have much pleasure in writing this foreword. More than fifteen years ago,
when Professor Roy was working in the Indian Statistical Institute, he and I and
other colleagues had discussed the question of bringing out a series of statistical
monographs. Professor Roy had undertaken at that time to prepare one on
multivariate analysis and has now completed his voluntary assignment. He has very
kindly made over the copyright to the Indian Statistical Institute which is
thankfully accepted by the Institute.
We should also like to offer our thanks to Messrs. John Wiley & Sons for 
the help we have received from them. 




P. C. Mahalanobis
27 July 1957. 


PREFACE 


This monograph does not by any means attempt to cover the entire area of 
multivariate analysis, or even a major part of it. Aside from certain basic notions 
and results due to Fisher, Hotelling, Mahalanobis, Karl Pearson, Wilks, Wishart, Yule 
and some of their predecessors, which have now become current coin, this monograph 
is primarily concerned with those developments in multivariate analysis in which the 
author has been specially interested and with which he and some of his collaborators 
have been associated over several years. Part of the material presented here, as 
far as the author is aware, has not been published before, while the rest has been 
collected from papers by various workers in this sector including the author and his 
collaborators. It will be seen that in this monograph the statistical approach to 
different problems and the mathematical treatment of all such problems are uniform 
and perhaps somewhat individual, and that this applies to all specific results, no 
matter whether they are due to the author and his collaborators, or to other workers 
in the field or to both groups simultaneously. 

What has not been discussed in this monograph has been developed and 
adequately handled in important papers by Anderson, Bartlett, Bose, Hsu, Kendall, 
Mahalanobis, Mosteller, Narain, Rao, Votaw, Wald and Brookner, Wilks and several 
other workers. Three excellent books touching upon but not primarily restricted to this 
sector, “Advanced Theory of Statistics” by M. G. Kendall, Vol. 2 [35], “Advanced
Statistical Methods in Biometric Research” by C. R. Rao [14] and “Mathematical
Statistics” by S. S. Wilks [28] have, between them, brought together and competently 
presented a substantial part of this material. For an adequate, unified and up-to-date 
presentation of this whole material the author, among others, is looking forward to 
the forthcoming book by T. W. Anderson, supposed to be dealing perhaps more or
less exclusively with multivariate analysis. 

The preparation of this monograph within a relatively short period has been
made possible only through the active co-operation of the entire secretarial staff of the
department of statistics at Chapel Hill including, in particular, Mrs. Bonnie Baker
Fathman, Mrs. Anne Kiley and Mrs. Mary Ann Taylor who did most of the typing
and of several students of the author including, in particular, K. V. Ramachandran,
A. E. Sarhan, V. N. Murty, R. Bargmann and R. Gnanadeshikan who rendered
indispensable mechanical and critical help. This job was supported, in part, by the
United States Air Force through the Office of Scientific Research of the Air Research
and Development Command. The printing and publication were kindly undertaken
by the Indian Statistical Institute and the Eka Press, Calcutta, for which mention
must be made of J. Roy, S. K. Mitra and R. G. Laha and other members of the
staff of the Indian Statistical Institute and the Eka Press for their kind help. To
all these individuals and organizations are due the sincerest thanks of the author.


The author would be deeply grateful if errors were brought to his notice and 
suggestions were made for improvement in form no less than in content. 
GENERAL INTRODUCTION


Multivariate analysis, till now, has been mostly concerned with point esti- 
mation of or testing of hypothesis on parameters or parametric functions occurring 
in one or more multivariate normal populations, the estimation or testing of hypo- 
thesis being of course in terms of random samples drawn from such populations. 
Except for the last chapter of the main body (i.e., Chapter 15) which deals with 
categorical data, this monograph also is concerned with “normal variate” data, but 
here point estimation is not discussed at all; and although testing of hypotheses is 
discussed a good deal, a careful reader will perceive that the main accent is on 
obtaining confidence bounds on certain parametric functions, the testing of hypo- 
theses (in so far as it is developed) being largely a means to that end. The
parametric functions that figure in this monograph are, in each case, a set of natural
measures of departure from the customary null hypothesis, there being, in some 
simple situations, a single such function (or a single measure), and in some more 
complicated situations a set of such functions (or a set of measures). Thus, out of 
the first fourteen chapters of the main body which deal with “normal variate” data, 
the first twelve chapters constitute a conscious attempt to lead up to confidence 
bounds on parametric functions (which, in each case, is a measure or a set of 
measures of deviation from the customary hypothesis), which then are discussed 
in detail in Chapters 13 and 14. 

In each case this measure (or set of measures) of deviation from the customary
hypothesis subsumes, as a special case, the prior notion of a distance function between
two populations (or dispersion between several populations) which had (i) its begin- 
ning in the coefficient of racial likeness of Karl Pearson (who, however, did not con- 
ceive of a distance function in this connection), (ii) its second stage of development in 
the D² of Mahalanobis (who may have been motivated, among other things, by a
desire to fuse the notion of the coefficient of racial likeness with the notion of the 
distance function of relativity), (iii) its third stage of development in the distance 
function between two or more general types of populations, as evolved, among others, 
by Bhattacharya who in particular, showed a more statistical slant and (iv) a fourth 
stage of development mostly in the U.S.A. with the same slant which may have 
been independent of but which, in fact, postdates Bhattacharya’s own work. 


However, from the general standpoint of this monograph, the reader will 
notice three large gaps involved in the omission of the important sectors of (a) factor 
analysis, (b) classification problems and (c) the multivariate generalization of vari- 
ance components analysis in univariate analysis of variance. The reasons for the 
omission are the following. Under (a) the author has long been looking for further 
clarification of the issues and then for some means of bringing (a) into the framework 
of confidence bounds on suitable parametric functions. Under (c) the author has
been looking for some more clarification even within the univariate set-up, and then 
a suitable multivariate generalization of the univariate set-up and finally a way 
to bring this into the framework of confidence bounds on proper parametric functions.
Under (b) the task for the author was one of merely bringing the problem within the 
framework of confidence bounds. It has been only long after the manuscript went 
to the press that all this has been accomplished to the partial satisfaction of the 
author, and all this with such further developments as may occur meanwhile will be 
presented in the next edition of the monograph. 

In Chapter 15 a small beginning has been made in the direction of a certain 
type of non-parametric generalization of “normal variate” analysis of variance and 
multivariate analysis. A lot more has been done in this area since the manuscript 
went to the press and a great deal more remains to be done. The author hopes to 
either incorporate all this in a future edition of this monograph or perhaps present 
it in a separate monograph. Despite all the mathematical elegance and compara- 
tive simplicity of “normal variate” analysis of variance and multivariate analysis, 
one cannot help feeling that the non-parametric approach (whether of this variety 
or of other varieties) is far more realistic and physically meaningful, and is likely, 
in the future, to supplant, to a large extent, the existing techniques of “normal variate” 
analysis of variance and multivariate analysis, including those discussed in the first 
fourteen chapters of this monograph. Nevertheless, it seems that the customary 
“normal variate” techniques and concepts (and perhaps also those discussed in this 
monograph) will long remain a guide and a source of stimulus to non-parametric 
developments, in both their mathematical and their physical aspects—a point which 
may be somewhat overlooked by those who are not thoroughly conversant with all 
the current “normal variate” developments that have occurred over the last forty
years.


CHAPTER ONE 


Notation, Preliminaries and General Objectives 


1.1. Notation. As far as possible the following notation and convention will
be used, all departures being clearly indicated at the proper places. Greek letters
will stand for population parameters and Italic letters over the first half of the
alphabet for given (non-stochastic) quantities and over the latter part from, say, r to
the end for sample quantities. Matrices and vectors under consideration will consist
of real elements (these will be called real matrices or vectors) except occasionally when
they might have complex elements (these will be called complex matrices or vectors).
Capital letters will stand for matrices, small letters for scalars, bold face small letters
for column vectors and for row vectors if they are primed. The transpose of a matrix
or a column vector will be denoted by priming such quantities, the conjugate complex
transpose of a matrix M by M*, the set of characteristic roots of M (if it is square) by
c(M), its trace by tr M, the modulus of the determinant of such a matrix by |M|, the
modulus of a scalar m by |m| and the inverse of a matrix M (if it is square and
non-singular) by $M^{-1}$. A real square matrix M(p×p) will be called ⊥ if it is
orthogonal, i.e., if MM’ = I(p) (= M’M, necessarily), and if M(p×q) (p < q) is
such that MM’ = I(p), then M will be called semi-orthogonal. To indicate the
structure, a p×q matrix, say M, or a p×1 column vector, say m, will sometimes
be written respectively as M(p×q) or m(p×1). A matrix M whose typical element
is $m_{ij}$ will sometimes be denoted by $(m_{ij})$. The (ij)-th element of a matrix M will
be denoted by $(M)_{ij}$ or $m_{ij}$. A diagonal matrix whose diagonal elements are, say,
$a_1, a_2, \ldots, a_p$, will be denoted by $D_a$. A diagonal matrix with +1 for its diagonal
elements will be denoted by D. A(p×p) or sometimes simply A will stand for the
triangular matrix

\[
\begin{bmatrix}
a_{11} & 0 & \cdots & 0 \\
a_{21} & a_{22} & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
a_{p1} & a_{p2} & \cdots & a_{pp}
\end{bmatrix}
\tag{1.1.1}
\]


We have also $|A| = \prod_{i=1}^{p} a_{ii}$ and it is easy to check that if A is non-singular, then
$a_{ii} \neq 0$, and $A^{-1}$ will also be a triangular matrix with the same configuration as A.
The product of two triangular matrices of the same configuration is a triangular
matrix of the same configuration. A’ is a triangular matrix of the opposite
configuration to A. If A(p×q) = $(a_{ij})$, then dA will stand for $\prod_{i=1}^{p} \prod_{j=1}^{q} da_{ij}$ and if
a’(1×p) = $(a_1, \ldots, a_p)$, then da will denote $\prod_{i=1}^{p} da_i$. The Jacobian of the
transformation to an independent set of variables, say, x from any independent set of
variables, say, y (with of course the same number of elements as x) will be denoted
by J(x : y), while a symbol, say, $\partial(u_1, \ldots, u_p)/\partial(v_1, \ldots, v_p)$ will have the same well-known
meaning as in the calculus. The terms "positive definite" and “positive semi- 
definite" will be abbreviated p.d. and p.s.d. respectively. “Almost everywhere", 
that is, “except for a set of (probability) measure zero" will be referred to as a.e. 
As usual, p.d.f. and c.d.f. will stand respectively for the probability density function 
and the cumulative distribution function (of a stochastic variate). 
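A stochastic variate $x$ ($-\infty < x < \infty$) will, as usual, be called $N(\xi, \sigma^2)$ if it
has the p.d.f.

\[
\frac{1}{\sigma\sqrt{2\pi}} \exp\left[-(x-\xi)^2 / 2\sigma^2\right], \tag{1.1.2}
\]

where $-\infty < \xi < \infty$ and $\sigma > 0$. It is well-known that $E(x) = \xi$ (to be called the
mean) and $E(x-\xi)^2 = \sigma^2$ (to be called the variance). A stochastic vector $\mathbf{x}(p \times 1)$
($-\infty < x_i < \infty$) will be called $N(\boldsymbol{\xi}, \Sigma)$ if it has the p.d.f.

\[
\frac{1}{|\Sigma|^{1/2} (2\pi)^{p/2}} \exp\left[-\tfrac{1}{2} \operatorname{tr} \Sigma^{-1} (\mathbf{x}-\boldsymbol{\xi})(\mathbf{x}'-\boldsymbol{\xi}')\right], \tag{1.1.3}
\]

where $-\infty < \xi_i < \infty$ and where $\Sigma$ is a $p \times p$ symmetric p.d. matrix. It is also well
known that

\[
E(\mathbf{x}) = \boldsymbol{\xi} \quad \text{and} \quad E(\mathbf{x}-\boldsymbol{\xi})(\mathbf{x}'-\boldsymbol{\xi}') = \Sigma. \tag{1.1.4}
\]

$\boldsymbol{\xi}$ will be called the population mean vector and $\Sigma$ the population dispersion matrix
[see Chapter 3].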
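
As an illustration only, the multivariate normal density just defined can be evaluated numerically for p = 2. The sketch below is not part of the monograph: the function name `mvn_pdf_2d` and the restriction to two dimensions (so that the inverse can be written out by hand) are assumptions made for the example.

```python
import math


def mvn_pdf_2d(x, xi, sigma):
    """Bivariate normal density, with the exponent written in trace form.

    x, xi : sequences of length 2 (observation and population mean vector).
    sigma : 2x2 nested list, assumed symmetric positive definite.
    """
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    if s12 != s21 or s11 <= 0 or det <= 0:
        raise ValueError("sigma must be symmetric positive definite")
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = [[s22 / det, -s12 / det], [-s21 / det, s11 / det]]
    d = [x[0] - xi[0], x[1] - xi[1]]
    # tr(Sigma^{-1} d d') = sum over i, j of inv[i][j] * d[j] * d[i]
    trace = sum(inv[i][j] * d[j] * d[i] for i in range(2) for j in range(2))
    p = 2
    return math.exp(-0.5 * trace) / (math.sqrt(det) * (2 * math.pi) ** (p / 2))
```

At the mean vector the exponent vanishes, so the value reduces to the normalizing constant, for example `mvn_pdf_2d([0, 0], [0, 0], [[1, 0], [0, 1]])` equals 1/(2π).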

The symbols $\in$, $\cup$, $\cap$, “A statement $\Leftrightarrow$ another statement”, “A statement
$\Rightarrow$ another statement”, will all be taken over from the notation and
terminology of set theory and measure theory and so also $w'$ for the complement of a set
$w$ in a space X. The most powerful critical region of size, say, $\beta_H$ ($< 1$) (which under
fairly general conditions, will exist and which, under slightly less general conditions,
will also be unique) of a simple hypothesis $H_0$ against a simple alternative $H$ (such
that $H \in \Omega$ where $\Omega$ stands for the domain of possible alternatives) will be denoted
by $w(H_0, H, \beta_H)$ and its complement, the acceptance region by $w'(H_0, H, \beta_H)$, to
indicate that, in general, both will depend on $\beta_H$, $H_0$ and $H$. The union of regions
$w(H_0, H, \beta_H)$ over different $H \in \Omega$ will be denoted by $\bigcup_{H \in \Omega} w(H_0, H, \beta_H)$ or simply
by $\bigcup_H w$, and the intersection of regions $w'(H_0, H, \beta_H)$ over $H \in \Omega$ by $\bigcap_{H \in \Omega} w'(H_0, H, \beta_H)$
or simply by $\bigcap_H w'$. $P(H_0, H, \beta_H)$ will stand for the power of the most powerful
critical region of size $\beta_H$ for $H_0$ against $H$. $\phi_H$ will usually denote the p.d.f. under
the hypothesis $H$.

1.2. Some preliminaries on testing of hypotheses. It is well-known that
$w(H_0, H, \beta_H)$ and $w'(H_0, H, \beta_H)$ are given respectively by

\[
w(H_0, H, \beta_H) : \phi_H \ge \lambda \phi_{H_0}; \qquad w'(H_0, H, \beta_H) : \phi_H < \lambda \phi_{H_0}, \tag{1.2.1}
\]

where $\lambda$ is determined by $P[\mathbf{x} \in w(H_0, H, \beta_H) \mid H_0] = \beta_H$. It can be shown [43] that

\[
P(H_0, H, \beta_H) \ge \beta_H. \tag{1.2.2}
\]

Proof: Assume that $\phi$ is such that $w$ defined by (1.2.1) is unique. Integrating
the first inequality of (1.2.1) over $w(H_0, H, \beta_H)$ and the second one over $w'$, we
have respectively, $P(H_0, H, \beta_H) \ge \lambda \beta_H$ and $1 - P(H_0, H, \beta_H) \le \lambda(1 - \beta_H)$, from which,
after a slight reduction, we have (1.2.2).
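
The "slight reduction" at the end of the proof can be spelled out by cases. This is a sketch rather than the author's own wording: it assumes only that the constant multiplying the null density is non-negative and that the size lies between 0 and 1, and it writes $P$, $\lambda$ and $\beta_H$ for the power, that constant and the size.

```latex
\begin{align*}
\lambda \ge 1 &: \quad P \ge \lambda \beta_H \ge \beta_H, \\
\lambda < 1   &: \quad 1 - P \le \lambda (1 - \beta_H) \le 1 - \beta_H
                 \;\Longrightarrow\; P \ge \beta_H .
\end{align*}
```

In either case the power is at least the size, which is the stated result.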

Note that in general $\lambda$ will be of the form $\lambda(H_0, H, \beta_H)$ depending on all the
elements. Incidentally, any critical region of size $\beta$ for $H_0$, whose power with
respect to an alternative $H$ is greater than or equal to $\beta$, will be called an unbiassed
critical region for $H_0$ against $H$.

Along with the more common terminology, namely the most powerful test of
$H_0$ against $H$, a locally most powerful test of $H_0$ (in those situations where it is
meaningful), a uniformly most powerful test of $H_0$ (if it exists at all with respect to the whole
relevant class of alternatives) we shall also use the less common terminology, namely,
an unbiassed test of $H_0$ against $H$, a locally unbiassed test and a uniformly unbiassed
test. The result (1.2.2) shows that a most powerful test of $H_0$ against $H$, a locally most
powerful test of $H_0$ and a uniformly most powerful test of $H_0$ are also respectively an
unbiassed test of $H_0$ against $H$, a locally unbiassed test and a uniformly unbiassed
test. Of course, in general, unbiassed tests will be a much larger class, of which the most
powerful test will be just a member.
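
The statement that a most powerful test is unbiassed can be checked numerically in the simplest case. The sketch below assumes a setting that is not taken from the text: a single observation from a normal population with unit variance, the hypothesis that the mean is 0, and a simple alternative with mean `theta` > 0, for which the most powerful region is the upper tail. The function name is hypothetical.

```python
from statistics import NormalDist


def most_powerful_power(theta, beta):
    """Power of the size-beta most powerful region for N(0,1) against N(theta,1).

    Assumes one observation and theta > 0, so the likelihood ratio is
    increasing in x and the region has the form x >= cutoff.
    """
    std = NormalDist()
    cutoff = std.inv_cdf(1.0 - beta)      # size beta under the hypothesis
    return 1.0 - std.cdf(cutoff - theta)  # probability of the region under the alternative
```

For any `theta` > 0 the returned power exceeds `beta`, and it tends to `beta` as `theta` tends to 0.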

The likelihood ratio critical region at a level, say $\alpha$, of a simple $H_0$ against
the whole class of simple $H \in \Omega$, provided that it exists, will be denoted by $w(H_0, \alpha)$.
As is well-known it is given by

\[
w(H_0, \alpha) : \phi(\mathbf{x}) \ge \lambda(H_0, \alpha)\, \phi_{H_0}(\mathbf{x}), \tag{1.2.3}
\]

where for a given $\mathbf{x}$, $\phi(\mathbf{x})$ stands for the largest $\phi_H(\mathbf{x})$ (provided that it exists) with
respect to variation of $H$ over $\Omega$, and where $\lambda(H_0, \alpha)$ is given by

\[
P(\mathbf{x} \in w(H_0, \alpha) \mid H_0) = \alpha. \tag{1.2.4}
\]

Notice that $\phi(\mathbf{x})$ is a function of $\mathbf{x}$ only, being independent of $H \in \Omega$, but may depend
on the total domain $\Omega$. The power of this test, against any particular alternative
$H \in \Omega$, will be denoted by $P(H_0, H, \alpha)$.

Assume now that $H_0$ is a composite hypothesis and $H \in \Omega$ a composite
alternative. In earlier papers [40, 41] the author gave a set of sufficient conditions on
$\phi_{H_0}$ for the availability of similar regions for $H_0$, and a set of (further) restrictions on
$\phi_{H_0}$ and $\phi_H$, for the availability, among these similar regions, of one which is the most
powerful for $H_0$ against $H$ in the following sense: Suppose that $H_0$ and $H$ are
composite hypotheses, each characterized by some specified and some unspecified elements,
so that, if the unspecified elements were specified, both $H_0$ and $H$ would be simple
hypotheses. Now suppose that, among the similar regions for $H_0$, there is one whose
location in the sample space depends on the specified elements of $H_0$ and possibly
on those of $H$, but not on the unspecified elements of $H_0$ or $H$, but which is
nevertheless the most powerful critical region for any simple hypothesis within $H_0$
(obtained by specifying the unspecified elements) against any simple alternative
within $H$ (obtained by specifying the unspecified elements). But this “most powerful”
is “most powerful among similar regions”. If we drop the restriction of similarity
and set up in a straightforward manner the most powerful critical region for the simple
hypothesis in question against the simple alternative in question, then we may get
a (non-similar) region having a larger power than that of the most powerful similar
critical region just referred to. Such a most powerful critical region may be
conveniently called a bisimilar region for $H_0$ against $H$. The likelihood ratio critical
region for composite $H_0$ against all composite $H \in \Omega$ (which we know how to construct,
provided that it exists), can be shown to be a similar region for $H_0$ under the
restrictions just referred to. In this situation the same notation will be used as introduced
in the previous paragraph for the case of a simple hypothesis against simple
alternatives, and the result (1.2.4) will also hold, it being noted that, while the regions will
be independent of the unspecified elements in $H_0$ and $H$, $P(H_0, H, \beta_H)$ and $P(H_0, H, \alpha)$
however, might depend on the unspecified elements of $H$ though not usually on those
of $H_0$.


1.3. General objectives. Throughout this monograph we shall restrict
ourselves to very limited objectives, namely solution of certain non-sequential, i.e., fixed
sample size two-decision problems, in which, for a preassigned level $\alpha$ or a confidence
coefficient $1-\alpha$, we are interested respectively in obtaining (i) a (similar) region test
of a composite $H_0$ which has some kind of reasonably ‘good’ property against the
whole class of relevant (composite) alternatives ($\in \Omega$) or (ii) a set of simultaneous
confidence bounds on deviations from $H_0$, naturally occurring in the problems to be
considered (all to be explained later), the confidence bounds, again, having some
kind of ‘good’ properties in terms of covering ‘wrong’ values of the deviations. The
scope of the discussion is thus professedly quite narrow and by no means fully 
adequate for the needs of any possible user of statistics, but that is as far as we can 
get at the moment. It is hoped that, in the near future, methods and techniques 
will develop perhaps in extension of those offered here, which can cope with the more 
recondite problems that are of real interest to the possible users of statistics. 


Towards these limited objectives, a heuristic method of test construction will 
be offered which leads to a certain class of tests including in particular, two members 
of special importance to be called respectively type I and type II tests and a genera- 
lization of type I test, to be called an extended type I test. The type II test will be 
identified with the widely known likelihood ratio criterion, but it is the type I and the 
extended type I test that will be used throughout this report, and, in the specific 
situations to be considered, it will be possible, in every case, to obtain, by inversion 
of these tests, suitable confidence bounds on certain deviations or measures of depar- 
ture from the hypothesis that naturally arise in the case considered. As observed 
at the outset, the general method is entirely heuristic and, therefore, the test or the set 
of confidence bounds that emerges as the end product, in any specific problem, has to 
be justified by its operating characteristics in that situation, no ‘good’ properties being 
guaranteed in advance by the general method of test construction itself.


CHAPTER TWO 
A Heuristic Class of Tests* 


2.1. Definitions and some remarks. Consider, for simplicity but without
any essential loss of generality (for the definitions could be immediately carried over
into the case of composite hypothesis and alternative), a simple hypothesis $H_0$ against
a simple alternative $H \in \Omega$.

(i) Put $\beta_H = \beta$ ($H \in \Omega$), and set up as the rejection and acceptance regions
for $H_0$, $\bigcup_{H \in \Omega} w(H_0, H, \beta)$ and its complement $\bigcap_{H \in \Omega} w'(H_0, H, \beta)$, to be called
respectively $\bigcup_H$ and $\bigcap_H$. This is defined to be a type I test for $H_0$ against the whole
class $H \in \Omega$, the level of significance $\alpha$ being given by

\[
P\Big(\mathbf{x} \in \bigcup_{H \in \Omega} w(H_0, H, \beta) \,\Big|\, H_0\Big) = \alpha(H_0, \beta) \quad (\ge \beta). \tag{2.1.1}
\]

Let us for the moment assume non-triviality, that is, that, given $\alpha < 1$, we can find
$\beta = \beta(H_0, \alpha) > 0$, for which (2.1.1) will hold.

(ii) Put, in Section 1.2, $\lambda(H_0, H, \beta_H) = \mu$ (a preassigned constant) for all $H \in \Omega$
and rewrite $w(H_0, H, \beta_H)$ and $w'(H_0, H, \beta_H)$ as $w^*(H_0, H, \mu)$ and $w^{*\prime}(H_0, H, \mu)$
respectively.

Now set up, as the rejection and acceptance regions for $H_0$, $\bigcup_H w^*(H_0, H, \mu)$
and its complement $\bigcap_H w^{*\prime}(H_0, H, \mu)$, to be called, respectively, $\bigcup_H^*$ and $\bigcap_H^*$, where
the $\beta_H$'s ($H \in \Omega$) are subject to $\lambda(H_0, H, \beta_H) = \mu$ (a preassigned constant). This is
defined to be a type II test for $H_0$ against the whole class $H \in \Omega$, the level of significance
$\alpha^*$ being given by

\[
P\Big(\mathbf{x} \in \bigcup_{H \in \Omega} w^*(H_0, H, \mu) \,\Big|\, H_0\Big) = \alpha^*(H_0, \mu). \tag{2.1.2}
\]

Here again let us, for the moment, assume non-triviality, that is, that given $\alpha^*$ ($< 1$),
we can find a $\mu$ such that $\beta(H_0, H, \mu) = \beta_H$ ($> 0$) and that (2.1.2) will hold. This
can be easily recognized as the likelihood ratio test by the following consideration.

Notice that $w^*(H_0, H, \mu)$ (with a preassigned $\mu$) is given by

\[
w^*(H_0, H, \mu) : \phi_H(\mathbf{x}) \ge \mu\, \phi_{H_0}(\mathbf{x}). \tag{2.1.3}
\]

Any $\mathbf{x}$ would belong to $\bigcup_H w^*(H_0, H, \mu)$ if for that $\mathbf{x}$, there were at least one $H \in \Omega$ for
which (2.1.3) holds. It is easy to see that this would be accomplished if for that $\mathbf{x}$
the largest $\phi_H(\mathbf{x})$ (under variation of $H$ over $\Omega$) were $\ge \mu\, \phi_{H_0}(\mathbf{x})$. Hence it is obvious
that

\[
\bigcup_H w^*(H_0, H, \mu) : \phi(\mathbf{x}) \ge \mu\, \phi_{H_0}(\mathbf{x}); \qquad \bigcap_H w^{*\prime}(H_0, H, \mu) : \phi(\mathbf{x}) < \mu\, \phi_{H_0}(\mathbf{x}). \tag{2.1.4}
\]
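
The two constructions can be compared in a toy setting. The sketch below rests on assumptions that are not in the text: one observation x from a normal population with unit variance, the hypothesis that the mean is 0, and the class of alternatives consisting of all non-zero means. Under those assumptions the type I union of equal-size most powerful regions is |x| ≥ c, and the type II union of constant-ratio regions is |x| ≥ sqrt(2 log μ). All function names are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def type_one_region(beta):
    """Type I test: union of size-beta most powerful regions over all non-zero means.

    Returns (c, alpha), the region being |x| >= c. Requires beta < 0.5 so that c > 0.
    """
    if not 0.0 < beta < 0.5:
        raise ValueError("beta must lie strictly between 0 and 0.5")
    c = STD.inv_cdf(1.0 - beta)
    alpha = 2.0 * (1.0 - STD.cdf(c))  # level of the union, here 2 * beta
    return c, alpha


def type_two_region(mu):
    """Type II test: the largest density ratio is exp(x**2 / 2), compared with mu.

    Returns (c, alpha_star), the region being |x| >= c.
    """
    if mu <= 1.0:
        raise ValueError("mu must exceed 1 for a non-trivial region")
    c = math.sqrt(2.0 * math.log(mu))
    alpha_star = 2.0 * (1.0 - STD.cdf(c))
    return c, alpha_star


def matching_mu(beta):
    """Constant mu for which the type II region equals the type I region."""
    c, _ = type_one_region(beta)
    return math.exp(0.5 * c * c)
```

With `beta = 0.025` the type I level is 0.05, and `type_two_region(matching_mu(0.025))` returns the same cut-off, so in this toy case the two heuristic principles lead to one and the same critical region.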


* See reference [43] in this connection. 
2.2. An obvious property of the two types of tests. Notice that $\bigcup_H$ includes all
$w(H_0, H, \beta)$ and $\bigcup_H^*$, all $w^*(H_0, H, \mu)$. Now putting

\[
P(\mathbf{x} \in \textstyle\bigcup_H \mid H) = P(\textstyle\bigcup_H, H, \alpha) \quad \text{and} \quad P(\mathbf{x} \in \textstyle\bigcup_H^* \mid H) = P(\textstyle\bigcup_H^*, H, \alpha^*)
\]

we shall have from Section (2.1), for the two types of tests,

\[
\beta(H_0, \alpha) = \beta \le P(H_0, H, \beta) \le P(\textstyle\bigcup_H, H, \alpha) \le P(H_0, H, \alpha) \le 1; \qquad P(H_0, H, \alpha) \ge \alpha; \tag{2.2.1}
\]

\[
\beta^*(H_0, H, \alpha^*) = \beta_H \le P^*(H_0, H, \mu) \le P(\textstyle\bigcup_H^*, H, \alpha^*) \le P(H_0, H, \alpha^*) \le 1; \qquad P(H_0, H, \alpha^*) \ge \alpha^*. \tag{2.2.2}
\]

(2.2.1) and (2.2.2) give respectively, for all $H \in \Omega$, the lower bounds $P(H_0, H, \beta)$
and $P^*(H_0, H, \mu)$ for $P(\bigcup_H, H, \alpha)$ and $P(\bigcup_H^*, H, \alpha^*)$ which, however, in general,
would be far from close except sometimes for large “deviation” from $H_0$. With more
knowledge of the forms of $\phi_{H_0}$ and $\phi_H$ it is often possible to get far closer bounds;
even the actual powers are often computable without much difficulty (and turn out
to be pretty high) as for example in most of the classical tests on normal populations.

It is easy to see that the results of (2.1) and (2.2) could be easily generalized
to cover the case of composite $H_0$ against composite $H \in \Omega$ provided that we have similar
regions for $H_0$ and a bisimilar region for $H_0$ against $H$. This, therefore, need not
be separately treated.

2.3. Display of two classical tests as type I tests. (i) Almost all classical 
tests on univariate and multivariate normal populations, (ii) most classical tests on 
other types of populations and (iii) many tests on multivariate normal populations 
proposed in recent years are known to be derivable (and indeed many of them have, 
in fact, been derived) from the “likelihood ratio” principle, so that they belong to 
type II. The author finds that all the customary tests in category (i), for example 
the test of significance of (1) a mean, (2) a mean difference, (3) total or partial or
multiple correlation and (4) regressions, (5) the F-test in analysis of variance, (6)
the test based on Hotelling’s $T^2$, all belong to type I as well. Those classical tests
in category (ii) that the author has examined so far also all belong to type I. Coming 
to those situations that are sought to be handled by tests proposed under category 
(iii), the author finds that the likelihood ratio tests offered so far, while they 
automatically belong to type II, do not belong to type I. On the other hand, if, in 
these situations, one carries out the spirit and method of discriminant analysis, one 
gets tests which belong to type I in a sense slightly more general than we have 
indicated so far. 

In this section we consider, for illustration, two well-known classical tests 
and show that they belong to type I. 

(i) For $N(\xi_1, \sigma^2)$ and $N(\xi_2, \sigma^2)$ the classical test of $H(\xi_1 = \xi_2) = H_0$ against
$H(\xi_1 \neq \xi_2) = H$ at a level $\alpha$ is based on a critical region given by

\[
t \ge t_0 \quad \text{or} \quad t \le -t_0, \tag{2.3.1}
\]

where
\[
t = (n_1 + n_2 - 2)^{1/2} \left(\frac{n_1 n_2}{n_1 + n_2}\right)^{1/2} \frac{\bar{x}_1 - \bar{x}_2}{\left[(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2\right]^{1/2}},
\]
and $t_0$ is given by $P(t \ge t_0 \mid H_0) = \alpha/2$ and where $(\bar{x}_1, \bar{x}_2)$; $(s_1, s_2)$ stand for the means and
standard deviations of two random samples of sizes $n_1$ and $n_2$ drawn from $N(\xi_1, \sigma^2)$
and $N(\xi_2, \sigma^2)$, respectively. This is well-known as a likelihood ratio test, but it is
easily checked as type I as well, in the following way. It is well-known that $t \ge t_0$
is a one-sided uniformly most powerful (bisimilar) region of size $\alpha/2$ for the composite
$H_0$ against the composite $H(\xi_1 > \xi_2) = H_1$ and so also is $t \le -t_0$ for $H_0$ against
$H(\xi_1 < \xi_2) = H_2$; taking the union we have (2.3.1) of size $\alpha$.
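
The two-sided region as a union of two one-sided regions can be sketched in a few lines. The code below is illustrative only: the function names are hypothetical, and the critical value `t0` is assumed to be supplied by the caller (for example from a table of the t-distribution with n1 + n2 − 2 degrees of freedom), since the standard library has no t quantile function.

```python
import math
from statistics import mean, variance


def two_sample_t(x1, x2):
    """Two-sample t statistic built from sample means and sample variances."""
    n1, n2 = len(x1), len(x2)
    # (n1 - 1) s1^2 + (n2 - 1) s2^2, the pooled sum of squares
    pooled_ss = (n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)
    return (math.sqrt(n1 + n2 - 2) * math.sqrt(n1 * n2 / (n1 + n2))
            * (mean(x1) - mean(x2)) / math.sqrt(pooled_ss))


def in_union_region(t, t0):
    """True if t falls in the union of the two one-sided regions."""
    upper = t >= t0    # one-sided region against a larger first mean
    lower = t <= -t0   # one-sided region against a smaller first mean
    return upper or lower
```

For the samples `[1, 2, 3, 4]` and `[2, 4, 6, 8]` the statistic is −√3, which lies in the union for `t0 = 1.5` but not for `t0 = 2.0`.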


(ii) Consider the testing of a general linear hypothesis in analysis of
variance which, as is well-known, can be formally reduced to the following. Suppose we
have random samples of sizes $n_h$, means $\bar{x}_h$ and standard deviations $s_h$ drawn
respectively from $N(\xi_h, \sigma^2)$ ($h = 1, \ldots, k$), and suppose we want to test
$H(\xi_1 = \xi_2 = \cdots = \xi_k) = H_0$ against the whole class $H$ of $(\xi_1, \ldots, \xi_k)$ violating $H_0$. Put

\[
n = \sum_{h=1}^{k} n_h; \qquad \bar{x} = \sum_{h=1}^{k} n_h \bar{x}_h / n; \qquad \bar{\xi} = \sum_{h=1}^{k} n_h \xi_h / n.
\]

Now the classical F-test for $H_0$, which is well-known
to be a likelihood ratio or type II test has at a level $\alpha$ the critical region given by

\[
F \ge F_0, \tag{2.3.2}
\]

where
\[
F = \frac{n-k}{k-1} \cdot \frac{\sum_{h=1}^{k} n_h (\bar{x}_h - \bar{x})^2}{\sum_{h=1}^{k} (n_h - 1) s_h^2},
\]
and where $F_0$ is given by $P(F \ge F_0 \mid H_0) = \alpha$.

To recognize this as a type I test as well we proceed as follows. It is
observed in earlier papers [40], [21] that among similar regions for $H_0$ (which exist) there
is a most powerful (bisimilar) region for $H_0$ against any specific $(\xi_1, \ldots, \xi_k) = \boldsymbol{\xi}$ violating
$H_0$, the region of size, say, $\beta$ being given by:

\[
t \ge t_0, \tag{2.3.3}
\]

where $t = \sqrt{n-2}\, \cot\theta$,
\[
\cos\theta = \frac{\sum_{h=1}^{k} n_h (\bar{x}_h - \bar{x})(\xi_h - \bar{\xi})}{\left[\sum_{h=1}^{k} n_h (\xi_h - \bar{\xi})^2\right]^{1/2} \left[\sum_{h=1}^{k} \{(n_h - 1) s_h^2 + n_h (\bar{x}_h - \bar{x})^2\}\right]^{1/2}},
\]
and where $t_0$ is given by
\[
P(t \ge t_0 \mid H_0) = \beta.
\]

It is also noticed in those papers that this $t$ has exactly the usual t-distribution with
$(n-2)$ degrees of freedom. Notice that $t_0 = t_0(n, \beta)$ and $\beta = \beta(n, t_0)$. To obtain
now the union of regions: $t \ge t_0$ over different sets of $(\xi_1, \ldots, \xi_k)$ we note that a given
observation set belongs to the union if for that observation set there is at least one $t$
(obtained by varying over $\xi_1, \ldots, \xi_k$) such that $t \ge t_0$. The union is thus easily checked
to be given by: the largest $t$ (by varying over $\xi_1, \ldots, \xi_k$) $\ge t_0$ (which is fixed). But by
(2.3.3) the largest $t$ would correspond to the largest value of $\cos\theta$, and, given $\bar{x}_h$’s and
$s_h$’s, the largest value of $\cos\theta$ (under variation over $\xi_1, \ldots, \xi_k$) is easily seen to be
given by

\[
\cos\theta = \left[\sum_{h=1}^{k} n_h (\bar{x}_h - \bar{x})^2\right]^{1/2} \Big/ \left[\sum_{h=1}^{k} \{(n_h - 1) s_h^2 + n_h (\bar{x}_h - \bar{x})^2\}\right]^{1/2}, \tag{2.3.4}
\]

so that the largest $t$ is given by

\[
t = (n-2)^{1/2} \left[\sum_{h=1}^{k} n_h (\bar{x}_h - \bar{x})^2\right]^{1/2} \Big/ \left[\sum_{h=1}^{k} (n_h - 1) s_h^2\right]^{1/2}. \tag{2.3.5}
\]

Therefore, the union of regions: $t \ge t_0$, is given exactly by (2.3.2), which is the critical
region of the F-test. Notice that, given the $\alpha$ of the F-test, $F_0$ is obtained from (2.3.2)
in the form $F_0(k-1, n-k; \alpha)$; and next by identifying the union of regions $t \ge t_0$,
with $F \ge F_0$ we have

\[
t_0 = \left[(k-1)(n-2) F_0 / (n-k)\right]^{1/2}
\]
and next from (2.3.3) we have
\[
\beta = \beta(n, t_0) = \beta(k-1, n-k; \alpha).
\]
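
The identification of the union of the t-regions with the F-region can be checked numerically. The sketch below is illustrative and makes assumptions beyond the text: the statistic against a specific alternative is taken as sqrt(n − 2)·cot θ, with cos θ the normalized weighted inner product between the deviations of the sample means and those of the hypothetical means, and all function names are hypothetical. It compares the largest t with [(k − 1)(n − 2)F/(n − k)]^(1/2).

```python
import math
import random


def anova_pieces(samples):
    """Return n, k, group means, and between/within sums of squares."""
    n = sum(len(s) for s in samples)
    k = len(samples)
    means = [sum(s) / len(s) for s in samples]
    grand = sum(sum(s) for s in samples) / n
    between = sum(len(s) * (m - grand) ** 2 for s, m in zip(samples, means))
    within = sum((v - m) ** 2 for s, m in zip(samples, means) for v in s)
    return n, k, means, grand, between, within


def f_statistic(samples):
    """Classical F ratio for equality of k means."""
    n, k, _, _, between, within = anova_pieces(samples)
    return (n - k) / (k - 1) * between / within


def t_for_alternative(samples, xi):
    """sqrt(n - 2) * cot(theta) for a specific alternative xi (means not all equal)."""
    n, _, means, grand, between, within = anova_pieces(samples)
    sizes = [len(s) for s in samples]
    xi_bar = sum(nh * x for nh, x in zip(sizes, xi)) / n
    num = sum(nh * (m - grand) * (x - xi_bar) for nh, m, x in zip(sizes, means, xi))
    den = (math.sqrt(sum(nh * (x - xi_bar) ** 2 for nh, x in zip(sizes, xi)))
           * math.sqrt(between + within))
    cos_theta = num / den
    return math.sqrt(n - 2) * cos_theta / math.sqrt(1.0 - cos_theta ** 2)


def random_alternative_ts(samples, trials, seed=0):
    """t values for randomly drawn alternatives, for comparison with the maximum."""
    rng = random.Random(seed)
    k = len(samples)
    return [t_for_alternative(samples, [rng.uniform(-5, 5) for _ in range(k)])
            for _ in range(trials)]
```

Taking the alternative proportional to the observed group means gives the largest t; for the samples `[[1, 2, 3], [2, 4, 6], [5, 5, 8]]` its square is 10.5, which equals (k − 1)(n − 2)F/(n − k) with F = 4.5, and no randomly drawn alternative exceeds it.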


2.4. Some further remarks on the two types of tests. It may be noted (see
Section 2.1) that by specializing the $\beta_H$’s (the sizes of the most powerful critical regions
against different alternatives) in two special ways we get in a heuristic manner the
two types of test. By specializing the $\beta_H$’s in other ways other heuristic principles
could be set up, some of which, in special situations, might be “better” than the type
I or type II tests. It has already been observed that in many situations type I
and type II tests would coincide. This does not mean, however, that in those
situations, $\beta(H_0, H, \mu)$ of the type II test would be the $\beta$ of the type I test. Given $H_0$
and the $H$’s, it would be possible to find a $\beta$ for type I and a $\mu$ for type II such that
the same critical region for $H_0$ against the whole class $H \in \Omega$ could be looked upon
as $\bigcup_H w(H_0, H, \beta)$ in relation to the first type and also as $\bigcup_H w^*(H_0, H, \mu)$ in relation
to the second type.


The following theoretical question or group of questions, now under investi- 
gation, is extremely important. Under what general restrictions on the probability 
law of x and on $H_0$ and $H \in \Omega$ would either or both of the tests be non-trivial (in the
sense discussed in Section 2.1) and usable (in the sense of having a distribution problem 
amenable to tabulation), and unbiassed (against all relevant alternatives) and/or 
admissible and/or reasonably powerful (in the sense of having not too bad a power 
against all relevant alternatives)? So far as the author is aware, these questions have 
not yet been adequately discussed in a general manner (let alone being answered) 
even for the likelihood ratio or type II test (which has so long been extensively used 
in practice), and no attempt will be made in this monograph to discuss these questions. 


The advantage, however, of having two such heuristic principles (with the possibility
of having two different tests in many situations) is that it gives us more elbow room
than we would have with one such principle, in the matter of construction of
non-trivial, usable and “pretty good” tests.


One remark on the admissibility of a test (in the Neyman-Pearson set-up) is especially important. In this set-up suppose we have a hypothesis $H_0$ and a class of alternatives $H \in \Omega$. Assume, for simplicity of discussion, that $H_0$ and each $H$ are simple hypotheses. Now suppose that there is any critical region $w_0$ of size, say $\alpha$, for $H_0$. $w_0$ will be said to be inadmissible (or admissible) against the whole class $H \in \Omega$ according as we can find (or fail to find) another critical region of size $\alpha$, say $w_1$, such that

$$P(x \in w_1 \mid H) \ge P(x \in w_0 \mid H) \text{ for all } H \in \Omega,$$
$$\text{and } P(x \in w_1 \mid H) > P(x \in w_0 \mid H) \text{ for at least one } H \in \Omega. \tag{2.4.1}$$


Suppose now that $w_0$ is an inadmissible critical region in that we can find a $w_1$ satisfying (2.4.1) and assume, for simplicity of discussion, that $w_1$ itself is admissible. It is easy to satisfy oneself that from any physical point of view $w_1$ is better than $w_0$. Suppose now that $w_2$ is another critical region for $H_0$ of size $\alpha$, which is admissible against all $H \in \Omega$. It does not follow from the definition of admissibility that $w_2$ will necessarily have the property (2.4.1) in relation to $w_0$. On the contrary it may well


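be that

$$P(x \in w_2 \mid H) < P(x \in w_0 \mid H) \text{ for most } H \in \Omega,$$
$$\text{and } P(x \in w_2 \mid H) > P(x \in w_0 \mid H) \text{ for some } H \in \Omega,$$
$$\text{and } P(x \in w_2 \mid H) = P(x \in w_0 \mid H) \text{ for some } H \in \Omega. \tag{2.4.2}$$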


A precise definition of 'most' need not detain us here. In fact, if a most powerful critical region of $H_0$ against a specific $H \in \Omega$ is most powerful in the strict sense of having a power against $H$ which is $>$ and not just $\ge$ that of any other rival, then this critical region will be, by definition, an admissible one against the whole class of $H \in \Omega$. But it may have a poor power against most other alternatives. In other words, it is easy to convince ourselves that a particular inadmissible region may, from any physical point of view, be much better than many admissible regions, although there must be at least one admissible test (and usually a whole subclass of such tests) which satisfies (2.4.1) with respect to $w_0$ and is thus better than $w_0$ from any physical point of view. This is a point which is apt to be missed by the statistician, especially the theoretical statistician.
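The dominance relation (2.4.1) can be checked mechanically once the powers of two regions are tabulated over a set of alternatives. The following is a minimal Python sketch and not part of the original text: the helper name `dominates` and all power values are hypothetical, chosen only so that $w_1$ dominates $w_0$ while a third region $w_2$ is dominated by neither and yet has poor power at most alternatives.

```python
def dominates(power_new, power_old):
    """Return True if power_new satisfies (2.4.1) relative to power_old.

    Both arguments list the power of a size-alpha region at the same
    finite collection of alternatives H.
    """
    never_worse = all(a >= b for a, b in zip(power_new, power_old))
    somewhere_better = any(a > b for a, b in zip(power_new, power_old))
    return never_worse and somewhere_better


# Hypothetical powers at five alternatives (illustrative numbers only).
w0 = [0.30, 0.40, 0.50, 0.60, 0.10]
w1 = [0.30, 0.45, 0.50, 0.65, 0.10]   # at least as good as w0 everywhere
w2 = [0.06, 0.07, 0.08, 0.09, 0.95]   # poor except at the last alternative

w1_dominates_w0 = dominates(w1, w0)
w2_dominates_w0 = dominates(w2, w0)
w0_dominates_w2 = dominates(w0, w2)
```

Within this small collection `w0` is inadmissible because `w1` satisfies (2.4.1) against it, whereas `w2` cannot be dominated by `w0` or `w1` (it has the highest power at the last alternative) although it would rarely be preferred in practice.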


2.5. On the operating characteristics of certain specific tests. It turns out that in many specific situations (as in the cases to be discussed herein) it is possible to obtain a class of admissible critical regions for $H_0$ against all $H \in \Omega$, each region having a power which is a function of certain parameters which are naturally interpreted as measures of deviation from $H_0$. This admissible class may not of course constitute the totality of all admissible critical regions. Now among this class, if there is a subclass which is not only unbiassed against all $H \in \Omega$ but is such that the power of each is a monotonically increasing function of each of the 'deviations', then this subclass is, from any physical point of view, the really valuable subset and will be said to be an admissible, unbiassed subset having the monotonicity property. In situations where this is available and where all that we know about $H$ is that $H \in \Omega$, the rest of the admissible class may, for most purposes, be thrown out. It seems to the author that in such situations, this subset or subclass of critical regions is the best that we can obtain as a whole and any further attempt at any choice among this subclass, on the basis of some stronger optimum property or principle, would be open to controversy in that the selection principle would be likely to be artificial and not universally convincing. The author is aware of the asymptotic optimum properties of the likelihood ratio criterion for simple and composite hypotheses, under certain broad restrictions, but there are strong reasons to suppose that these asymptotic optimum properties are not peculiar to the likelihood ratio criterion but must be shared by a large class of criteria or critical regions. Where $H_0$ is composite there is the further restriction of similarity which, of course, can be relaxed by just requiring that any critical region should have size $\le \alpha\ (< 1)$ under variation of the unspecified elements of $H_0$, in which case the region will be said to be a valid one, a special case of a valid region being a similar region. In any actual situation (usually involving a composite $H_0$), if we can find a similar (or valid) critical region which (i) is unbiassed against all $H \in \Omega$, (ii) has the monotonicity or near monotonicity property to be defined in Chapter 10 and (iii) is also admissible, then we shall consider this to be a satisfactory region and any attempt at getting a region with a stronger optimum property would, in most practical situations, be futile for reasons already indicated. However, if, as in most of the situations to be discussed herein, we have a number of rival regions available satisfying (i)-(iii), then it is no doubt an interesting and useful question as to how the powers of the different rivals compare over the whole range of $H \in \Omega$, one rival being better than another over some part of the range with a reversal in another part of the range and so on. In most of the rather complex situations to be discussed in this monograph this would not be possible, because not only are the actual powers not available, but we do not even have, at the moment, methods and techniques of comparing powers (in the sense of greater or less) of two rivals without actually obtaining the powers. It is hoped that such techniques will be available in the near future. It may be noticed here that quite often it is possible to assert properties (i) and (ii) and sometimes also (iii) without explicitly obtaining the power functions. It may also be observed that among similar (or valid) regions satisfying (i)-(iii) an additional consideration for recommendation might be (iv) reasonable simplicity of the null distribution problem, i.e., the distribution problem under $H_0$. If we are also interested, as we shall be in all the problems hereafter, in simultaneous confidence statements on deviation parameters (or functions thereof), then another additional consideration would be (v) the possibility of inverting the test to obtain (without running into excessively difficult distribution problems) such simultaneous confidence bounds (preferably intervals).

It will be seen that the tests offered in this monograph are similar region tests (in fact, they will be shown to be stronger than that, in a sense to be explained hereafter) having properties (i), (ii), (iv) and (v). There are strong grounds for believing (although we do not yet have a rigorous proof except for the degenerate special cases which will be indicated as we get along) that the tests also satisfy (iii). Furthermore, the tests that are being offered for the different situations are such that it has been possible to obtain for each test a 'pretty good' (easily available) lower bound to the power function (and consequently a lower bound to the shortness, i.e., the probability of covering wrong values of the parameters or parametric functions, of the associated set of simultaneous confidence intervals), 'pretty good' in the sense that the lower bound itself is reasonably large and rapidly goes up as the deviations increase. To the tests considered hereafter there are certain rivals (better known but not discussed in this monograph for reasons indicated at the proper places) for which some of the above properties are well-known to be true and some of the others are also conjectured by the author to be true, but have not yet been proved.


2.6. Extended type I test. Consider a composite hypothesis $H_0$ against a set of composite alternatives $H_i \in \Omega$ ($i \in$ continuum). It often happens, as for example in the broad situations discussed in Chapter 5, that, while there are similar regions for $H_0$, there is among these no most powerful (bisimilar) region for $H_0$ against any $H_i$ ($i \in$ continuum), but that we have, instead, the following situation. Suppose we have composite hypotheses $H_{0j}$ ($j \in$ continuum) such that $\bigcap_j H_{0j} = H_0$ and composite alternatives $H_{ij}$ ($i \in$ continuum; $j \in$ continuum) such that $\bigcap_j H_{ij} = H_i$. Notice that $H_{0j}$ and $H_{ij}$ have more unspecified elements than $H_0$ and $H_i$ respectively. It may well be that we have (as in the cases discussed in Chapter 5) not only similar regions for $H_{0j}$ but also, among these, a most powerful (bisimilar) region for $H_{0j}$ against any $H_{ij}$ (one for each $i$ with $j \in$ continuum; and then $i \in$ continuum). Consider critical regions $w(H_{0j}, H_{ij}, \beta)$ of size $\beta$ each. Then by our test procedure, over $\bigcap_i \bigcap_j$ of $\bar{w}(H_{0j}, H_{ij}, \beta)$, which we call $\bigcap_{ij}$ for simplicity, we are anyway accepting $\bigcap_j H_{0j}$, that is, $H_0$, and over its complement $\bigcup_i \bigcup_j w(H_{0j}, H_{ij}, \beta)$ we are rejecting at least one $H_{0j}$ and therefore $H_0$ itself. Suppose we set this up as a heuristic test for $H_0$ against the whole class $H_i \in \Omega$. Then the critical region will be $\bigcup_i \bigcup_j w(H_{0j}, H_{ij}, \beta)$, or $\bigcup_{ij}$, of size $\alpha$, given by

$$P(x \in \textstyle\bigcup_{ij} \mid H_0) = \alpha,$$
$$\text{so that } \alpha = \alpha(H_0, \beta) \text{ and } \beta = \beta(H_0, \alpha). \tag{2.6.1}$$

As before, non-triviality will be assumed, and it is easy to check that we shall have for all $i$ and $j$ the following inequality

$$\beta \le P(H_{0j}, H_{ij}, \beta) \le P(H_0, H_i, \alpha) < 1. \tag{2.6.2}$$

It may be noted that while $w(H_{0j}, H_{ij}, \beta)$, a bisimilar region of size $\beta$ for $H_{0j}$ against $H_{ij}$, is independent of the unspecified elements of $H_{0j}$ and $H_{ij}$, and while the location of $\bigcup_{ij}$ must be, and its size might be (as indeed it is for all the cases considered in Chapter 5), independent of the unspecified elements of $H_{0j}$ and $H_{ij}$, the power $P(H_{0j}, H_{ij}, \beta)$ might involve the unspecified elements of $H_{ij}$ and $P(H_0, H_i, \alpha)$ involve those of $H_i$. As observed in Section 2.2, the lower bound to the power of the test, given by (2.6.2), while it is in general easily available, is, at the same time, much too crude. With more knowledge of the probability law a much closer lower bound can often be found as will be exemplified in later sections.


The gist of the heuristic union-intersection principle, in its application to the two-decision problem of the Neyman-Pearson variety, is this. Suppose we have a certain type of hypothesis $H_0$ against a certain type of alternative $H$, such that $H_0$ and $H$ are mutually exclusive sets for which we have an acceptance region $\bar{w}(H_0, H)$ and a critical region $w(H_0, H)$ having some optimum properties and also some mathematical simplicity. Suppose, furthermore, that there is an $H_0^*$ formed by the intersection (or union) of $H_0$'s of the previous type and $H^*$ formed by the union (or intersection) of $H$'s of the previous type, such that $H_0^*$ and $H^*$ are also mutually exclusive sets. Then the acceptance region for $H_0^*$ against $H^*$ is given by

$$\bigcap_{H_0^* \subset H_0,\ H \subset H^*} \bar{w}(H_0, H) \quad \text{or} \quad \bigcup_{H_0 \subset H_0^*,\ H^* \subset H} \bar{w}(H_0, H)$$

and the critical region by

$$\bigcup_{H_0^* \subset H_0,\ H \subset H^*} w(H_0, H) \quad \text{or} \quad \bigcap_{H_0 \subset H_0^*,\ H^* \subset H} w(H_0, H).$$

Notice that, in particular, $H_0^*$ may be the same as $H_0$ and/or $H^*$ may be the same as $H$. There might be of course other variations on this. It is found in many situations, that if the original test has certain optimum properties, then the derived test has some reasonably good properties, and no test with a strong optimum property of any physically meaningful kind may be available at all. The same type of heuristic principle can be and has been actually used (though not in this monograph) on more general types of decision problems, too, the general idea being that if a complex decision problem can be built up of less complex decision problems each having a relatively simple decision rule with some optimum properties, then a decision rule for the more complex problem can often be built up from the (relatively simple) decision rules for the less complex problems. In many situations this decision rule will have reasonably good properties, and any rule having strong optimum properties (of any physically meaningful kind) may not be available at all.
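As an illustration of the principle, here is a sketch under simplifying assumptions that are not in the text: $p = 2$, a known dispersion matrix equal to the identity, and $u = \sqrt{m}\,\bar{x}$. Take $H_0 : \xi = 0$ as the intersection over all unit vectors $a$ of the hypotheses $H_{0a} : a'\xi = 0$. Each component has the two-sided critical region $|a'u| > c$ of size $\beta$, and the union of these regions over $a$ is the region $u'u > c^2$, whose size $\alpha$ follows from the chi-square distribution with 2 degrees of freedom. All function names are illustrative.

```python
import math


def normal_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def component_rejects(a, u, c):
    """Two-sided region for H_0a: a'xi = 0, with a of unit length."""
    return abs(a[0] * u[0] + a[1] * u[1]) > c


def union_rejects(u, c, steps=3600):
    """Union of the component regions, scanned over a grid of directions.

    Directions a and -a give the same region, so half a circle suffices.
    """
    for s in range(steps):
        t = math.pi * s / steps
        if component_rejects((math.cos(t), math.sin(t)), u, c):
            return True
    return False


c = 2.5
beta = 2.0 * (1.0 - normal_cdf(c))   # size of each component region
alpha = math.exp(-0.5 * c * c)       # P(chi-square with 2 d.f. > c^2)
```

The size $\alpha$ of the union exceeds the size $\beta$ of each component, in line with (2.6.1) where $\alpha$ is determined by $\beta$. The grid scan only approximates the union over all directions; the exact union region is $u'u > c^2$.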


CHAPTER THREE 


The Multivariate Normal Population 


A univariate normal p.d.f. has the form (1.1.2) so that the probability law can be rewritten in the form

$$\frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left[-\tfrac{1}{2}(x-\xi)\,\sigma^{-2}\,(x-\xi)\right] dx, \tag{3.1}$$

where $-\infty < x, \xi < \infty$, $\sigma > 0$, and $E(x) = \xi$, $E(x-\xi)^2 = V(x) = \sigma^2$. By analogy let us write down for $-\infty < (x_1, \ldots, x_p) = x'(1 \times p) < \infty$, the probability law

$$k \exp\left[-\tfrac{1}{2}(x'-\xi')B^{-1}(x-\xi)\right] dx, \tag{3.2}$$

where $-\infty < \xi' < \infty$, $B(p \times p)$ is symmetric p.d. and $k$ is a positive constant, and $B$ and $k$ have not yet been interpreted in statistical terms.

To obtain $k$ in terms of $B$, we use (A.3.9) to put $B = TT'$ and have $(x'-\xi')B^{-1}(x-\xi) = (x'-\xi')(T')^{-1}T^{-1}(x-\xi)$. Now put $T^{-1}(x-\xi) = y(p \times 1)$, so that $(x-\xi) = Ty$ and $J(x : y) = |T| = |B|^{1/2}$. Now $y$ has the probability law

$$k\,|B|^{1/2} \exp\left[-\tfrac{1}{2}y'y\right] dy, \tag{3.3}$$

so that $y_1, \ldots, y_p$ are independent $N(0, 1)$, each varying from $-\infty$ to $\infty$. Integrating out over $y$, we have

$$k\,|B|^{1/2}(2\pi)^{p/2} = 1 \quad \text{or} \quad k = 1/[(2\pi)^{p/2}|B|^{1/2}], \tag{3.4}$$

which gives $k$ in terms of $B$.
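The two facts used above, $|T| = |B|^{1/2}$ for a triangular factor $T$ of $B$ and the value of $k$ in (3.4), can be checked numerically. The sketch below is illustrative only: it assumes $p = 2$, $\xi = 0$, a hypothetical matrix $B$, and a lower-triangular $T$ (the text's (A.3.9) is taken here to be a factorization of this kind). The integral of (3.2) is approximated by a midpoint rule on a finite grid.

```python
import math


def cholesky_2x2(b11, b12, b22):
    """Lower-triangular T with TT' = B for a 2 x 2 symmetric p.d. B."""
    t11 = math.sqrt(b11)
    t21 = b12 / t11
    t22 = math.sqrt(b22 - t21 * t21)
    return t11, t21, t22


b11, b12, b22 = 2.0, 0.6, 1.0            # hypothetical B
t11, t21, t22 = cholesky_2x2(b11, b12, b22)
det_b = b11 * b22 - b12 * b12
jacobian = t11 * t22                      # |T| for a triangular T
k = 1.0 / ((2.0 * math.pi) * math.sqrt(det_b))   # (3.4) with p = 2


def density(x1, x2):
    """k exp[-(1/2) x' B^{-1} x], i.e. (3.2) with xi = 0."""
    q = (b22 * x1 * x1 - 2.0 * b12 * x1 * x2 + b11 * x2 * x2) / det_b
    return k * math.exp(-0.5 * q)


# Midpoint rule over the square [-10, 10] x [-10, 10].
half_width, steps = 10.0, 400
h = 2.0 * half_width / steps
total = 0.0
for i in range(steps):
    x1 = -half_width + (i + 0.5) * h
    for j in range(steps):
        x2 = -half_width + (j + 0.5) * h
        total += density(x1, x2)
total *= h * h
```

With this choice of $k$ the numerical integral `total` is very close to 1, and `jacobian ** 2` reproduces $|B|$.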
To interpret $B$ statistically we proceed as follows. We first prove that for any non-null $a'(1 \times p)$,

$$a'x \text{ is } N(a'\xi,\ a'Ba). \tag{3.5}$$

Proof: $a'(x-\xi) = a'Ty$, using the transformation in connection with (3.3). Now using (A.3.11) put $a'T = (a'TT'a)^{1/2}\, l'(1 \times p)$, where $l'l = 1$. Next, complete $l'(1 \times p)$ into an orthogonal matrix $\begin{bmatrix} l' \\ L \end{bmatrix}$ (with $1$ and $p-1$ rows) and now make the orthogonal transformation

$$z(p \times 1) = \begin{bmatrix} l' \\ L \end{bmatrix} y(p \times 1).$$

We have thus $J(y : z) = 1$ and $z_1 = l'y = a'Ty/(a'Ba)^{1/2} = a'(x-\xi)/(a'Ba)^{1/2}$. Going back to (3.3) we now have for $z$ the probability law

$$\frac{1}{(2\pi)^{p/2}} \exp\left[-\tfrac{1}{2}z'z\right] dz, \tag{3.5a}$$

so that the $z_i$'s are independent $N(0, 1)$ and thus $z_1$ is $N(0, 1)$, and, therefore, $a'(x-\xi)$ is $N(0, a'Ba)$ and hence $a'x$ is $N(a'\xi, a'Ba)$, which completes the proof of (3.5).
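Result (3.5) lends itself to a quick simulation check. The following sketch is illustrative and not from the text: the triangular factor `T`, the mean vector `xi` and the vector `a` are hypothetical, observations are generated as $x = \xi + Ty$ with independent $N(0,1)$ components of $y$ (the transformation used with (3.3)), and the sample mean and variance of $a'x$ are compared with $a'\xi$ and $a'Ba$ where $B = TT'$.

```python
import random

random.seed(0)

p = 3
T = [[1.0, 0.0, 0.0],
     [0.5, 1.2, 0.0],
     [-0.3, 0.4, 0.8]]          # hypothetical lower-triangular factor
xi = [1.0, -2.0, 0.5]            # hypothetical mean vector
a = [2.0, -1.0, 1.0]             # hypothetical non-null vector

# B = TT'
B = [[sum(T[i][r] * T[j][r] for r in range(p)) for j in range(p)]
     for i in range(p)]

target_mean = sum(a[i] * xi[i] for i in range(p))                 # a'xi
target_var = sum(a[i] * B[i][j] * a[j]
                 for i in range(p) for j in range(p))             # a'Ba


def draw_linear_function():
    """One observation of a'x, with x = xi + T y and y ~ N(0, I)."""
    y = [random.gauss(0.0, 1.0) for _ in range(p)]
    x = [xi[i] + sum(T[i][r] * y[r] for r in range(p)) for i in range(p)]
    return sum(a[i] * x[i] for i in range(p))


N = 40000
draws = [draw_linear_function() for _ in range(N)]
sample_mean = sum(draws) / N
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (N - 1)
```

For these inputs `target_mean` is 4.5 and `target_var` is 2.72; the simulated moments should agree with them up to sampling error. A simulation of this kind checks only the first two moments, not normality itself.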


Putting $a' = (0, \ldots, 0, 1, 0, \ldots, 0)$ (0's everywhere else and 1 in the $i$-th place) we check that $x_i$ is $N(\xi_i, (B)_{ii})$. This means that any marginal $x_i$ is normally distributed about $\xi_i$ as the mean value with a variance, say $\sigma_{ii} = (B)_{ii} = b_{ii}$, say ($i = 1, 2, \ldots, p$).


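We next prove that if $\begin{bmatrix} a_1' \\ a_2' \end{bmatrix}(2 \times p)$ is of rank 2, then $\begin{bmatrix} a_1' \\ a_2' \end{bmatrix} x(p \times 1)$ has the probability law:

$$\frac{1}{2\pi\,|C|^{1/2}} \exp\left[-\tfrac{1}{2}(x'-\xi')[a_1\ a_2]\, C^{-1} \begin{bmatrix} a_1' \\ a_2' \end{bmatrix}(x-\xi)\right] d\left(\begin{bmatrix} a_1' \\ a_2' \end{bmatrix} x\right), \tag{3.6}$$

where $C(2 \times 2) = \begin{bmatrix} a_1' \\ a_2' \end{bmatrix} B(p \times p)\, [a_1\ a_2]$, and that covariance $(a_1'x, a_2'x) = a_1'Ba_2$.

Proof: $\begin{bmatrix} a_1' \\ a_2' \end{bmatrix}(x-\xi) = \begin{bmatrix} a_1' \\ a_2' \end{bmatrix} Ty$, using the transformation in connection with (3.3). Now, using (A.3.11), put

$$\begin{bmatrix} a_1' \\ a_2' \end{bmatrix} T(p \times p) = \tilde{U}(2 \times 2)\, L(2 \times p), \text{ where } LL' = I(2). \tag{3.7}$$

Next, complete $L(2 \times p)$ into an orthogonal matrix $\begin{bmatrix} L \\ L_1 \end{bmatrix}$ (with $2$ and $p-2$ rows) and now make the orthogonal transformation:

$$z(p \times 1) = \begin{bmatrix} L \\ L_1 \end{bmatrix} y(p \times 1).$$

We have thus $J(y : z) = 1$ and

$$\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = L(2 \times p)\, y(p \times 1) = \tilde{U}^{-1}\begin{bmatrix} a_1' \\ a_2' \end{bmatrix} Ty = \tilde{U}^{-1}\begin{bmatrix} a_1' \\ a_2' \end{bmatrix}(x-\xi).$$

Going back to (3.3) we now have for $z(p \times 1)$ the same probability law as (3.3), so that the $z_i$'s are independent $N(0, 1)$ and thus $\begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$ has the probability law

$$\frac{1}{2\pi} \exp\left[-\tfrac{1}{2}[z_1\ z_2]\begin{bmatrix} z_1 \\ z_2 \end{bmatrix}\right] dz_1\, dz_2. \tag{3.8}$$

At this point, using (3.7) and noting that

$$J\left(\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} : \begin{bmatrix} a_1' \\ a_2' \end{bmatrix} x\right) = |\tilde{U}|^{-1} \quad \text{and} \quad \tilde{U}\tilde{U}' = \begin{bmatrix} a_1' \\ a_2' \end{bmatrix} TT'[a_1\ a_2] = \begin{bmatrix} a_1' \\ a_2' \end{bmatrix} B\,[a_1\ a_2] = C,$$

so that $|\tilde{U}| = |C|^{1/2}$, we have for $\begin{bmatrix} a_1' \\ a_2' \end{bmatrix} x$ the probability law

$$\frac{1}{2\pi\,|C|^{1/2}} \exp\left[-\tfrac{1}{2}(x'-\xi')[a_1\ a_2]\, C^{-1} \begin{bmatrix} a_1' \\ a_2' \end{bmatrix}(x-\xi)\right] d\left(\begin{bmatrix} a_1' \\ a_2' \end{bmatrix} x\right), \tag{3.9}$$

which proves the first part of (3.6). For the second part, we go back to (3.7) and observe that

$$\begin{bmatrix} a_1'(x-\xi) \\ a_2'(x-\xi) \end{bmatrix} = \tilde{U}\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} u_{11} & 0 \\ u_{21} & u_{22} \end{bmatrix}\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} u_{11}z_1 \\ u_{21}z_1 + u_{22}z_2 \end{bmatrix}, \tag{3.10}$$

so that covariance $[a_1'x, a_2'x] = E[u_{11}z_1(u_{21}z_1 + u_{22}z_2)] = u_{11}u_{21}$ (since $z_1$ and $z_2$ are independent $N(0, 1)$) $= (\tilde{U}\tilde{U}')_{12} = \left(\begin{bmatrix} a_1' \\ a_2' \end{bmatrix} TT'[a_1\ a_2]\right)_{12} = a_1'Ba_2$, from the definition of $\tilde{U}$ and $T$ in terms of $B$. This proves the second part of (3.6).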


Now taking $a_1'$ to be a vector with 1 for its $i$-th component and 0's for the other components and $a_2'$ a vector with 1 for its $j$-th component ($i \ne j$) and 0's for the other components we have Cov $(x_i, x_j) = \sigma_{ij}$ (say) $= (B)_{ij}$.


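Denoting by $\Sigma$ the variance-covariance (or, in other words, the dispersion) matrix of the $x_i$'s and taking into account the statement just before (3.6) we thus see that $\Sigma = B$, which thus provides the statistical interpretation of $B$. Now (3.2) can be rewritten as

$$\frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\left[-\tfrac{1}{2}(x'-\xi')\Sigma^{-1}(x-\xi)\right] dx, \tag{3.10a}$$

which will be called the $p$-variate normal distribution. (3.10a) is also otherwise expressed as

$$x : N(\xi, \Sigma), \tag{3.11}$$

and denoting by $\sigma_{ij}$ the elements of $\Sigma$ we have

$$a'(1 \times p)\,x(p \times 1) : N(a'\xi,\ a'\Sigma a), \tag{3.12}$$

for any non-null $a$, of which a special case is

$$x_i : N(\xi_i, \sigma_{ii}), \quad i = 1, 2, \ldots, p, \tag{3.13}$$

and

$$\begin{bmatrix} a_1' \\ a_2' \end{bmatrix} x : N\left(\begin{bmatrix} a_1' \\ a_2' \end{bmatrix}\xi,\ \begin{bmatrix} a_1' \\ a_2' \end{bmatrix}\Sigma\,[a_1\ a_2]\right), \tag{3.14}$$

for any matrix $\begin{bmatrix} a_1' \\ a_2' \end{bmatrix}$ of rank 2, of which a special case is

$$\begin{bmatrix} x_i \\ x_j \end{bmatrix} : N\left(\begin{bmatrix} \xi_i \\ \xi_j \end{bmatrix},\ \begin{bmatrix} \sigma_{ii} & \sigma_{ij} \\ \sigma_{ij} & \sigma_{jj} \end{bmatrix}\right). \tag{3.15}$$

The result (3.15) means, in words, that any marginal $(x_i, x_j)$ has the bivariate normal distribution about $\xi_i$ and $\xi_j$ as means, with variances $\sigma_{ii}$ and $\sigma_{jj}$ and a covariance $\sigma_{ij}$. There is a more general result than (3.14), namely, that for any $A(r \times p)$ (with $r < p$) of rank $r$,

$$A(r \times p)\,x(p \times 1) : N[A(r \times p)\,\xi(p \times 1),\ A(r \times p)\,\Sigma(p \times p)\,A'(p \times r)]. \tag{3.16}$$

Denote by $\sigma^{ij}$ the $ij$-th element of $\Sigma^{-1}$ and notice that $\sigma^{ij} = \sigma^{ji}$.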


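Now, starting from (3.10) and using (3.12)-(3.16), we have the following conditional distributions:

$$x_1 \mid x_2 : N[E(x_1 \mid x_2),\ 1/\sigma^{11}], \tag{3.17}$$

where $\sigma^{11}$ is here the 11-th element of $\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{bmatrix}^{-1}$, and $E(x_1 \mid x_2)$ is a linear function of $\xi_1$, $(x_2 - \xi_2)$, and

$$x_1 \mid x_2, \ldots, x_p : N[E(x_1 \mid x_2, \ldots, x_p),\ 1/\sigma^{11}], \tag{3.18}$$

where $\sigma^{11}$ is the (1,1)-th element of the matrix $\begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1p} \\ \vdots & & \vdots \\ \sigma_{p1} & \cdots & \sigma_{pp} \end{bmatrix}^{-1}$, and $E(x_1 \mid x_2, \ldots, x_p)$ is a linear function of $\xi_1$, $(x_2 - \xi_2), \ldots, (x_p - \xi_p)$, and also

$$x_1, x_2 \mid x_3, \ldots, x_p : N\left[E(x_1, x_2 \mid x_3, \ldots, x_p),\ \begin{bmatrix} \sigma^{11} & \sigma^{12} \\ \sigma^{12} & \sigma^{22} \end{bmatrix}^{-1}\right], \tag{3.19}$$

where $\sigma^{11}$, $\sigma^{22}$ and $\sigma^{12}$ are the 11, 22 and 12 elements of the inverse of the full matrix $\Sigma$, and $E(x_1, x_2 \mid x_3, \ldots, x_p)$ are two linear functions of $\xi_1, (x_3 - \xi_3), \ldots, (x_p - \xi_p)$ and $\xi_2, (x_3 - \xi_3), \ldots, (x_p - \xi_p)$.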

Taking the customary definition of the correlation coefficient from probability theory as

$$\rho_{x_1 x_2} = \rho_{12} = \text{Covariance}(x_1, x_2)/[V(x_1)V(x_2)]^{1/2}, \tag{3.20}$$

we define the partial correlation coefficient between $x_1$ and $x_2 \mid x_3, \ldots, x_p$ as

$$\rho_{x_1 x_2 \mid x_3 \ldots x_p} = \rho_{12 \cdot 34 \ldots p} = \frac{\text{covariance}(x_1, x_2 \mid x_3, \ldots, x_p)}{[V(x_1 \mid x_3, \ldots, x_p)\, V(x_2 \mid x_3, \ldots, x_p)]^{1/2}} = -\sigma^{12}/(\sigma^{11}\sigma^{22})^{1/2}, \text{ using (3.19)}. \tag{3.21}$$

Notice that this is independent of the values of $x_3, \ldots, x_p$.

Going back to (3.13) and (3.17) we can, for the normal population, at any rate, approach the concept of a correlation coefficient another way and reach the same formula as (3.20). This is as follows. We have $V(x_1) = \sigma_{11}$, and, from (3.17), $V(x_1 \mid x_2) = (\sigma_{11}\sigma_{22} - \sigma_{12}^2)/\sigma_{22}$, which is independent of the value of $x_2$. Therefore, in this case, it is easy to show that if we define the correlation coefficient, as we intuitively can, as

$$\rho_{12} = \left[1 - \frac{\text{conditional variance of } x_1 \mid x_2}{\text{total variance of } x_1}\right]^{1/2},$$

we should have

$$\rho_{12} = \left[\frac{\sigma_{11} - (\sigma_{11}\sigma_{22} - \sigma_{12}^2)/\sigma_{22}}{\sigma_{11}}\right]^{1/2} = \sigma_{12}/(\sigma_{11}\sigma_{22})^{1/2} = \rho_{12}. \tag{3.22}$$

It is clear that this approach also to the concept of partial correlation would, in the case of a multivariate normal distribution, lead to the same formula as (3.21).


Using this approach to the correlation coefficient between $x_1$ and $(x_2, \ldots, x_p)$, we define as the multiple correlation coefficient between $x_1$ and $(x_2, x_3, \ldots, x_p)$

$$\rho_{1 \cdot 23 \ldots p} = \left[1 - \frac{\text{conditional variance of } x_1 \mid x_2, \ldots, x_p}{\text{total variance of } x_1}\right]^{1/2} = \left[\frac{\sigma_{11} - 1/\sigma^{11}}{\sigma_{11}}\right]^{1/2} = [1 - 1/(\sigma^{11}\sigma_{11})]^{1/2}. \tag{3.23}$$

Notice that this is independent of the values of $x_2, x_3, \ldots, x_p$.

It is easy to check that for a multivariate normal distribution $x_1$ and $x_2$ are independent if and only if $\rho_{12} = 0$, and $x_1$ and $(x_2, \ldots, x_p)$ are independent if and only if $\rho_{1 \cdot 23 \ldots p} = 0$, and $x_1$ and $x_2$ are conditionally independent $\mid x_3, \ldots, x_p$ if and only if $\rho_{12 \cdot 34 \ldots p} = 0$.

To tie in with the customary definition (3.20) under which $-1 \le \rho_{12} \le 1$, we allow for both positive and negative square roots in the definition of $\rho_{12}$ and similarly let $\rho_{12 \cdot 34 \ldots p}$ also take both positive and negative values. But in the definition of $\rho_{1 \cdot 23 \ldots p}$ we allow only the positive square root, for obvious reasons.
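Formulas (3.21) and (3.23) can be evaluated directly from the elements of $\Sigma^{-1}$. The sketch below is illustrative: it assumes $p = 3$ and a hypothetical dispersion matrix, inverts it by cofactors, and compares the partial correlation from (3.21) with the familiar three-variable expression $(\rho_{12} - \rho_{13}\rho_{23})/[(1-\rho_{13}^2)(1-\rho_{23}^2)]^{1/2}$, which is a standard identity and not a formula quoted from this text.

```python
import math


def inverse_3x3(m):
    """Inverse of a 3 x 3 matrix by cofactors (adjugate / determinant)."""
    m11, m12, m13 = m[0]
    m21, m22, m23 = m[1]
    m31, m32, m33 = m[2]
    cof = [
        [m22 * m33 - m23 * m32, -(m21 * m33 - m23 * m31), m21 * m32 - m22 * m31],
        [-(m12 * m33 - m13 * m32), m11 * m33 - m13 * m31, -(m11 * m32 - m12 * m31)],
        [m12 * m23 - m13 * m22, -(m11 * m23 - m13 * m21), m11 * m22 - m12 * m21],
    ]
    det = m11 * cof[0][0] + m12 * cof[0][1] + m13 * cof[0][2]
    # The inverse is the transposed cofactor matrix divided by the determinant.
    return [[cof[c][r] / det for c in range(3)] for r in range(3)]


sigma = [[4.0, 2.0, 1.0],
         [2.0, 3.0, 1.5],
         [1.0, 1.5, 2.0]]        # hypothetical symmetric p.d. Sigma
sigma_inv = inverse_3x3(sigma)

# Partial correlation of x1 and x2 given x3, formula (3.21).
partial_12_3 = -sigma_inv[0][1] / math.sqrt(sigma_inv[0][0] * sigma_inv[1][1])

# Multiple correlation of x1 with (x2, x3), formula (3.23).
multiple_1_23 = math.sqrt(1.0 - 1.0 / (sigma_inv[0][0] * sigma[0][0]))


def rho(i, j):
    """Ordinary correlation coefficient, formula (3.20)."""
    return sigma[i][j] / math.sqrt(sigma[i][i] * sigma[j][j])


classical_partial = (rho(0, 1) - rho(0, 2) * rho(1, 2)) / math.sqrt(
    (1.0 - rho(0, 2) ** 2) * (1.0 - rho(1, 2) ** 2))
```

The two routes to the partial correlation agree, and for this matrix the squared multiple correlation equals one third, as a direct regression calculation also gives.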


CHAPTER FOUR 


Random Samples from p-Variate Normal Populations 


If $X(p \times m) = (x_1\ x_2\ \ldots\ x_m)$, where the $x_\lambda$'s ($\lambda = 1, \ldots, m$) are an independent set and each $x_\lambda$ is $N(\xi, \Sigma)$ and $p < m$, then denoting by $\xi(p \times m)$ the matrix $(\xi\ \xi\ \ldots\ \xi)$, we have the following probability law for $X$:

$$\frac{1}{(2\pi)^{pm/2}|\Sigma|^{m/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}(X-\xi)(X'-\xi')\right] dX. \tag{4.1}$$

The elements of $X$, of course, lie between $-\infty$ and $\infty$. If now we pass over from $X(p \times m)$ to $X_1(p \times m)$ by a transformation: $X(p \times m) = X_1(p \times m)A(m \times m)$, where $A$ is orthogonal (non-stochastic), then, by (A.5.4), $J(X : X_1) = 1$ and we have also $XX' = X_1X_1'$ (since $A$ is orthogonal).


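Putting now $\bar{x}'(1 \times p) = (\bar{x}_1, \ldots, \bar{x}_p)$, where $\bar{x}_i = \sum_{\lambda=1}^{m} x_{i\lambda}/m$ ($i = 1, \ldots, p$), it is easy to see that

$$X(p \times m)\,\xi'(m \times p) = m\,\bar{x}(p \times 1)\,\xi'(1 \times p), \quad \xi(p \times m)\,X'(m \times p) = m\,\xi(p \times 1)\,\bar{x}'(1 \times p)$$
$$\text{and } \xi(p \times m)\,\xi'(m \times p) = m\,\xi(p \times 1)\,\xi'(1 \times p). \tag{4.2}$$

Hence

$$(X-\xi)(X'-\xi') = XX' - m\bar{x}\xi' - m\xi\bar{x}' + m\xi\xi'. \tag{4.3}$$

If we now choose the orthogonal transformation matrix (from $X$ to $X_1$) such that

$$A = \begin{bmatrix} 1/\sqrt{m} & 1/\sqrt{m} & \cdots & 1/\sqrt{m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mm} \end{bmatrix} = \begin{bmatrix} 1/\sqrt{m}\ \ \cdots\ \ 1/\sqrt{m} \\ B(m-1 \times m) \end{bmatrix}, \tag{4.4}$$

we have

$$X_1 = XA' = [\sqrt{m}\,\bar{x} : X(p \times m)B'(m \times m-1)] = [\sqrt{m}\,\bar{x} : Y] \quad (\text{with } 1 \text{ and } m-1 \text{ columns}), \tag{4.5}$$

(say). Thus

$$XX' = X_1X_1' = m\bar{x}\bar{x}' + YY', \tag{4.6}$$

and hence substituting for $XX'$, the right hand side of (4.3) becomes

$$YY' + m\bar{x}\bar{x}' - m\bar{x}\xi' - m\xi\bar{x}' + m\xi\xi' \quad \text{or} \quad YY' + m(\bar{x}-\xi)(\bar{x}'-\xi'). \tag{4.7}$$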
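Remembering now that $J(X : X_1) = 1$ and transforming from $X$ to $X_1$ we have for $X_1$ the probability law:

$$\frac{1}{(2\pi)^{pm/2}|\Sigma|^{m/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}\{YY' + m(\bar{x}-\xi)(\bar{x}'-\xi')\}\right] dY\, d(\sqrt{m}\,\bar{x}). \tag{4.8}$$

Let us put

$$m - 1 = n \quad \text{and} \quad \bar{X}(p \times m) = [\bar{x}\ \ldots\ \bar{x}] = \begin{bmatrix} \bar{x}_1 & \bar{x}_1 & \cdots & \bar{x}_1 \\ \vdots & & & \vdots \\ \bar{x}_p & \bar{x}_p & \cdots & \bar{x}_p \end{bmatrix}, \tag{4.9}$$

and recall that for a sample $X(p \times m)$, the sample dispersion matrix $S$ or $(s_{ij})$ is defined by

$$nS = n(s_{ij}) = (X-\bar{X})(X'-\bar{X}'). \tag{4.10}$$

It is easy to check that

$$X\bar{X}' = \bar{X}X' = \bar{X}\bar{X}' = m\bar{x}\bar{x}', \quad \text{or} \quad (X-\bar{X})(X'-\bar{X}') = XX' - m\bar{x}\bar{x}', \tag{4.11}$$

so that, using (4.6), (4.10), and (4.11) we have

$$YY' = XX' - m\bar{x}\bar{x}' = (X-\bar{X})(X'-\bar{X}'). \tag{4.12}$$

We note that if, as in this case, the elements of $X$ vary from $-\infty$ to $\infty$, so do those of $\bar{x}$ and $Y$, to make the transformation one to one. Now integrating out (4.8) over $\bar{x}$ ($-\infty$ to $\infty$) we have for $Y$ the probability law:

$$\frac{1}{(2\pi)^{pn/2}|\Sigma|^{n/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}YY'\right] dY, \tag{4.13}$$

and integrating out over $Y$ we have for $\bar{x}$ the probability law:

$$\frac{m^{p/2}}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\left[-\tfrac{m}{2}\,\text{tr}\,\Sigma^{-1}(\bar{x}-\xi)(\bar{x}'-\xi')\right] d\bar{x}$$

or

$$\frac{m^{p/2}}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\left[-\tfrac{m}{2}(\bar{x}'-\xi')\Sigma^{-1}(\bar{x}-\xi)\right] d\bar{x}, \tag{4.14}$$

which shows that $\bar{x}$ is $N(\xi, \tfrac{1}{m}\Sigma)$.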
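The reduction from $X$ to $[\sqrt{m}\,\bar{x} : Y]$ can be traced numerically. The sketch below is an illustration under stated assumptions: the orthogonal matrix is taken to be a Helmert matrix, which is one convenient choice satisfying (4.4) and is not prescribed by the text, and the data matrix is hypothetical. It verifies that the first column of $XA'$ is $\sqrt{m}\,\bar{x}$ and that $YY'$ equals the matrix of corrected sums of squares and products, as in (4.12).

```python
import math


def helmert(m):
    """Orthogonal m x m matrix whose first row is (1/sqrt(m), ..., 1/sqrt(m))."""
    rows = [[1.0 / math.sqrt(m)] * m]
    for r in range(1, m):
        scale = 1.0 / math.sqrt(r * (r + 1))
        rows.append([scale] * r + [-r * scale] + [0.0] * (m - r - 1))
    return rows


def matmul(P, Q):
    return [[sum(P[i][t] * Q[t][j] for t in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    return [list(col) for col in zip(*P)]


# Hypothetical observation matrix with p = 2 variates and m = 4 observations.
X = [[2.0, 4.0, 3.0, 7.0],
     [1.0, 0.0, 5.0, 2.0]]
m = len(X[0])
A = helmert(m)

X1 = matmul(X, transpose(A))              # X_1 = X A', as in (4.5)
xbar = [sum(row) / m for row in X]
Y = [row[1:] for row in X1]               # the reduced matrix, p x (m - 1)
YYt = matmul(Y, transpose(Y))

centred = [[v - xbar[i] for v in row] for i, row in enumerate(X)]
corrected_ssp = matmul(centred, transpose(centred))   # (X - Xbar)(X' - Xbar')
```

Any other orthogonal completion of the first row would give a different $Y$ but the same $YY'$, which is why the choice of the remaining rows of $A$ does not matter for the dispersion matrix.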


For the purpose of any study of the sample or population dispersion matrices, we could, without any loss of generality, start right off (as we will quite often do) from (4.13) replacing $Y$ by $X$, but with the understanding that now $X$ is not the original matrix of $p \times m$ observations, but is a part of the transformed matrix (considered under (4.5)), being $p \times n$ in structure. We shall customarily call this the reduced matrix.

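For an $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ (with $p$ and $q$ rows and $m$ columns; $p + q < m$) consisting of $m$ independent $((p+q) \times 1)$ column vectors, the reduced matrix $Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}$ (with $p$ and $q$ rows and $n$ columns; $p + q < m$) will have the probability law:

$$\frac{1}{(2\pi)^{(p+q)n/2}|\Sigma|^{n/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}[Y_1'\ Y_2']\right] dY_1\, dY_2, \tag{4.15}$$

where $\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{bmatrix}$ is the partitioned population dispersion matrix (symmetric p.d.) for the $(p+q)$ normal variates.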


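For $k$ random samples of sizes $m_h$ from $k$ $N(\xi_h, \Sigma)$, we have for the $Y_h$'s and $\bar{x}_h'(1 \times p) = (\bar{x}_{1h}, \ldots, \bar{x}_{ph})$ ($h = 1, 2, \ldots, k$) the joint probability law:

$$\frac{1}{(2\pi)^{pm/2}|\Sigma|^{m/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}\left\{\sum_{h=1}^{k} Y_hY_h' + \sum_{h=1}^{k} m_h(\bar{x}_h-\xi_h)(\bar{x}_h'-\xi_h')\right\}\right] \prod_{h=1}^{k} dY_h \prod_{h=1}^{k} d(\sqrt{m_h}\,\bar{x}_h), \tag{4.16}$$

where $m = \sum_{h=1}^{k} m_h$. We next put $n_h = m_h - 1$, $n = \sum_{h=1}^{k} n_h = m - k$, $Y(p \times n) = [Y_1\ Y_2\ \ldots\ Y_k]$ (with $n_1, n_2, \ldots, n_k$ columns),

$$\bar{X}(p \times k) = \begin{bmatrix} \sqrt{m_1}\,\bar{x}_{11} & \cdots & \sqrt{m_k}\,\bar{x}_{1k} \\ \vdots & & \vdots \\ \sqrt{m_1}\,\bar{x}_{p1} & \cdots & \sqrt{m_k}\,\bar{x}_{pk} \end{bmatrix} = [\sqrt{m_1}\,\bar{x}_1\ \ \sqrt{m_2}\,\bar{x}_2\ \ \ldots\ \ \sqrt{m_k}\,\bar{x}_k]$$

$$\text{and } \xi(p \times k) = \begin{bmatrix} \sqrt{m_1}\,\xi_{11} & \cdots & \sqrt{m_k}\,\xi_{1k} \\ \vdots & & \vdots \\ \sqrt{m_1}\,\xi_{p1} & \cdots & \sqrt{m_k}\,\xi_{pk} \end{bmatrix} = [\sqrt{m_1}\,\xi_1\ \ \ldots\ \ \sqrt{m_k}\,\xi_k]. \tag{4.17}$$

Using now an orthogonal $A(k \times k)$ of the structure

$$\begin{bmatrix} \sqrt{m_1/m} & \sqrt{m_2/m} & \cdots & \sqrt{m_k/m} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & & & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{kk} \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} \sqrt{m_1/m}\ \ \cdots\ \ \sqrt{m_k/m} \\ B(k-1 \times k) \end{bmatrix} \text{ (say)}, \tag{4.18}$$

and transforming from $\bar{X}$ and $\xi$ to $\bar{X}_1$ and $\xi_1$ such that

$$\bar{X}_1 = \bar{X}A' = [\sqrt{m}\,\bar{x} : Z] \quad \text{and} \quad \xi_1 = \xi A' = [\sqrt{m}\,\xi : \zeta] \quad (\text{each with } 1 \text{ and } k-1 \text{ columns}), \tag{4.19}$$

remembering that $\sum_{h=1}^{k} Y_hY_h' = YY'$, and substituting in (4.16), it is easy to check for $\bar{x}(p \times 1)$, $Z(p \times k-1)$ and $Y(p \times n)$ the following probability law:

$$\frac{1}{(2\pi)^{pm/2}|\Sigma|^{m/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}\{YY' + (Z-\zeta)(Z'-\zeta') + m(\bar{x}-\xi)(\bar{x}'-\xi')\}\right] dY\, dZ\, d(\sqrt{m}\,\bar{x}). \tag{4.20}$$

As before, all elements of $(Y, Z, \bar{x})$ vary from $-\infty$ to $\infty$, and now integrating out over $\bar{x}$, we have for $(Y, Z)$ the joint probability law:

$$\frac{1}{(2\pi)^{p(m-1)/2}|\Sigma|^{(m-1)/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}\{YY' + (Z-\zeta)(Z'-\zeta')\}\right] dY\, dZ. \tag{4.21}$$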


Denoting by $(s_{ij})_h$, $\bar{x}_{ih}$, $\xi_{ih}$ the dispersion matrix of the $h$-th sample, the mean of the $h$-th sample for the $i$-th variate and the mean of the $h$-th population for the $i$-th variate ($i, j = 1, 2, \ldots, p$; $h = 1, 2, \ldots, k$), we note that

$$YY' = \left[\sum_{h=1}^{k} n_h (s_{ij})_h\right], \quad ZZ' = \left[\sum_{h=1}^{k} m_h(\bar{x}_{ih}-\bar{x}_i)(\bar{x}_{jh}-\bar{x}_j)\right],$$
$$Z\zeta' = \left[\sum_{h=1}^{k} m_h(\bar{x}_{ih}-\bar{x}_i)(\xi_{jh}-\xi_j)\right], \tag{4.22}$$
$$\zeta Z' = (Z\zeta')' \quad \text{and} \quad \zeta\zeta' = \left[\sum_{h=1}^{k} m_h(\xi_{ih}-\xi_i)(\xi_{jh}-\xi_j)\right],$$

where all the elements of the right hand side are either defined explicitly in terms of the original set of observations or parameters or are directly calculable in terms of that set. We shall denote $\left[\sum_{h=1}^{k} n_h (s_{ij})_h\right]/n$ by $S$ (to be called the sample "within" dispersion matrix), $\left[\sum_{h=1}^{k} m_h(\bar{x}_{ih}-\bar{x}_i)(\bar{x}_{jh}-\bar{x}_j)\right]/(k-1)$ by $S^*$ (to be called the sample "between" dispersion matrix), $\left[\sum_{h=1}^{k} m_h(\xi_{ih}-\xi_i)(\xi_{jh}-\xi_j)\right]/(k-1)$ by $\Sigma^*$ (to be called the population "between" dispersion matrix), and the vectors $\bar{x}$ and $\xi$ defined by (4.19) will be called respectively the sample and the population grand mean vector.
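The matrices $YY'$ and $ZZ'$ of (4.22) are the "within" and "between" sums of squares and products, and together they make up the total corrected sums of squares and products about the grand mean. The sketch below checks this decomposition on hypothetical data with $k = 3$ samples and $p = 2$ variates. The helper names are illustrative, and the divisors $n$ and $k - 1$ used for `S` and `S_star` follow the definitions as read here from a damaged passage, so they should be treated as an assumption.

```python
def mean_vector(columns):
    """Mean of a list of p-component observation vectors."""
    p = len(columns[0])
    return [sum(col[i] for col in columns) / len(columns) for i in range(p)]


def ssp(columns, centre):
    """Sums of squares and products of the columns about a given centre."""
    p = len(centre)
    return [[sum((col[i] - centre[i]) * (col[j] - centre[j]) for col in columns)
             for j in range(p)] for i in range(p)]


def add(P, Q):
    return [[P[i][j] + Q[i][j] for j in range(len(P[0]))] for i in range(len(P))]


def scale(P, factor):
    return [[factor * v for v in row] for row in P]


# Hypothetical data: sample sizes m_h = 3, 2, 4.
samples = [
    [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0)],
    [(5.0, 5.0), (6.0, 8.0)],
    [(0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (1.0, 1.0)],
]
k = len(samples)
m = sum(len(s) for s in samples)
n = m - k
pooled = [obs for s in samples for obs in s]
grand_mean = mean_vector(pooled)

within = [[0.0, 0.0], [0.0, 0.0]]     # corresponds to YY' in (4.22)
between = [[0.0, 0.0], [0.0, 0.0]]    # corresponds to ZZ' in (4.22)
for s in samples:
    centre = mean_vector(s)
    within = add(within, ssp(s, centre))
    between = add(between, scale(ssp([centre], grand_mean), len(s)))

total = ssp(pooled, grand_mean)
S = scale(within, 1.0 / n)
S_star = scale(between, 1.0 / (k - 1))
```

Here `total` equals `within` plus `between` element by element, which is the sample counterpart of the split of the exponent in (4.20) into a $YY'$ part and a $Z$ part.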


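For $k = 2$ it is easy to check that $Z(p \times k-1)$ and $\zeta(p \times k-1)$ become respectively the column vectors, say, $z(p \times 1)$ and $\zeta(p \times 1)$ given by

$$z(p \times 1) = \sqrt{m_{12}}\,(\bar{x}_1 - \bar{x}_2), \quad \zeta(p \times 1) = \sqrt{m_{12}}\,(\xi_1 - \xi_2), \quad \text{where } m_{12} = \frac{m_1 m_2}{m_1 + m_2}, \tag{4.23}$$

and we have for $Y(p \times (n_1 + n_2))$ and $(\bar{x}_1 - \bar{x}_2)$ the probability law:

$$\frac{1}{(2\pi)^{p(m-1)/2}|\Sigma|^{(m-1)/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}\{YY' + m_{12}(\bar{x}_1-\bar{x}_2-\xi_1+\xi_2)(\bar{x}_1'-\bar{x}_2'-\xi_1'+\xi_2')\}\right] dY\, d(\sqrt{m_{12}}\,(\bar{x}_1-\bar{x}_2)). \tag{4.24}$$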


For the simple regression set-up corresponding to a one-way classification we have an $X(p \times n)$ (with $p < n$) with a probability law

$$\frac{1}{(2\pi)^{pn/2}|\Sigma|^{n/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}(X - E(X))(X' - E(X'))\right] dX, \tag{4.25}$$

where $\Sigma$ is symmetric p.d. and where

$$E(X)(p \times n) = \xi(p \times n) + \beta(p \times q)\,U(q \times n). \tag{4.26}$$

Here $q < n$ but might be $> p$ or $\le p$. Without any loss of generality it is assumed that $\xi(p \times n) = (\xi\ \ldots\ \xi)$, i.e., $\xi$ is a matrix of unknown parameters consisting of the same $n$ columns of the unknown parameter vector $\xi(p \times 1)$. $\beta(p \times q)$ is also an unknown parameter matrix, while $U$ is a matrix of rank $q$ of the so-called "concomitant" variates, i.e., a set of observations which are supposed to stay constant with the probabilistic set-up of the experiment and the analysis. Again, without any loss of generality, but for simplicity of discussion, we can assume that the row sums of $U$ are zero (for each row), i.e., that $n\bar{u} = \sum_{j=1}^{n} u_j = 0$ (where $u_j$ denotes the $j$-th column vector of the $U$ matrix). As in (4.1)-(4.13) denote the $j$-th column vector of the $X$ matrix by $x_j$ and set $\bar{x}(p \times 1) = \sum_{j=1}^{n} x_j(p \times 1)/n$ and use the orthogonal transformation

$$X_1 = XA' = [\sqrt{n}\,\bar{x} : Y], \quad U_1 = UA' = [\sqrt{n}\,\bar{u} : V] = [0 : V], \quad \xi_1 = \xi A' = [\sqrt{n}\,\xi : 0]$$

(each partitioned into $1$ and $n-1$ columns), where

$$A = \begin{bmatrix} 1/\sqrt{n} & 1/\sqrt{n} & \cdots & 1/\sqrt{n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$

is an orthogonal matrix. Since $J(X : X_1) = J(X : \sqrt{n}\,\bar{x}, Y) = 1$, we have, as before, the joint distribution of $\bar{x}$ and $Y$ given by

$$\frac{1}{(2\pi)^{pn/2}|\Sigma|^{n/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}\{n(\bar{x}-\xi)(\bar{x}'-\xi') + (Y-\beta V)(Y'-V'\beta')\}\right] d(\sqrt{n}\,\bar{x})\, dY. \tag{4.27}$$

Integrating out over $\bar{x}$ from $-\infty$ to $\infty$, we have for $Y$ the probability law

$$\frac{1}{(2\pi)^{p(n-1)/2}|\Sigma|^{(n-1)/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}(Y-\beta V)(Y'-V'\beta')\right] dY. \tag{4.28}$$


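Using the results that (i) tr $(AB)$ = tr $(BA)$ and (ii) for any square matrix $A$, tr $A$ = tr $A'$, and recalling that $\Sigma^{-1}$ is symmetric, we have

$$\text{tr}\,\Sigma^{-1}\beta VY' = \text{tr}\,\Sigma^{-1}YV'\beta'. \tag{4.29}$$

Substituting in (4.28) we rewrite (4.28) as

$$\frac{1}{(2\pi)^{p(n-1)/2}|\Sigma|^{(n-1)/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}(YY' - 2YV'\beta' + \beta VV'\beta')\right] dY. \tag{4.30}$$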


Now notice that $YY' = XX' - n\bar{x}\bar{x}'$, $VV' = UU' - n\bar{u}\bar{u}' = UU'$ and $YV' = XU' - n\bar{x}\bar{u}' = XU'$ (since $\bar{u} = 0$). At this point let us use (A.3.11) to set

$$V(q \times n-1) = T(q \times q)\,L_1(q \times n-1), \text{ where } L_1L_1' = I(q), \tag{4.31}$$

and use (A.1.7) to complete $L_1$ into an orthogonal matrix $\begin{bmatrix} L_1 \\ L_2 \end{bmatrix}$ (with $q$ and $n-1-q$ rows and $n-1$ columns). Next use on (4.30) the transformation

$$Y(p \times n-1) = Y_1(p \times n-1)\begin{bmatrix} L_1 \\ L_2 \end{bmatrix} \tag{4.32}$$

or

$$Y_1(p \times n-1) = Y(p \times n-1)\,[L_1' : L_2'] = [Z_1 : Z_2] \quad (\text{with } q \text{ and } n-1-q \text{ columns}), \text{ say}.$$

Check that $Y_1Y_1' = YY' = Z_1Z_1' + Z_2Z_2'$, $YV' = YL_1'T' = Z_1T'$, $TT' = VV'$, recall that $J(Y : Z_1, Z_2) = 1$, and obtain for $Z_1$ and $Z_2$ the probability law

$$\frac{1}{(2\pi)^{p(n-1)/2}|\Sigma|^{(n-1)/2}} \exp\left[-\tfrac{1}{2}\,\text{tr}\,\Sigma^{-1}(Z_1Z_1' + Z_2Z_2' - 2Z_1T'\beta' + \beta TT'\beta')\right] dZ_1\, dZ_2. \tag{4.33}$$


This shows that the joint distribution of $Z_1(p \times q)$ and $Z_2(p \times n-1-q)$ is exactly of the same form as that of $Z$ and $Y$ in (4.21), the $m$ being replaced by $n$ here and the $k-1$ there being replaced by $q$ here. It may be interesting to check again what is implicit in the above, namely that the expression under the exponential is expressible very simply in terms of the original set-up. Verify that $Z_1T' = YV' = XU'$, $TT' = VV' = UU'$, $Z_1Z_1' = YL_1'L_1Y' = YV'(T')^{-1}T^{-1}VY'$ (using (4.31)) $= YV'(VV')^{-1}VY' = XU'(UU')^{-1}UX'$ and $Z_2Z_2' = YY' - YL_1'L_1Y' = XX' - n\bar{x}\bar{x}' - XU'(UU')^{-1}UX'$.
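In the simplest case $p = q = 1$ the last identity says that $Z_2Z_2'$ is the residual sum of squares of the regression of $x$ on a single concomitant variate, and $Z_1Z_1'$ is the sum of squares due to regression. The sketch below checks this on hypothetical data, assuming as in the text that the concomitant values sum to zero; the comparison with an ordinary least squares fit is an added illustration, not a step taken in the text.

```python
# Hypothetical observations (p = 1) and a concomitant variate (q = 1)
# whose values sum to zero.
x = [2.0, 3.5, 3.0, 6.0, 7.5]
u = [-2.0, -1.0, 0.0, 1.0, 2.0]
n = len(x)

xbar = sum(x) / n
XX = sum(v * v for v in x)                   # XX'
XU = sum(a * b for a, b in zip(x, u))        # XU'
UU = sum(b * b for b in u)                   # UU'

z1z1 = XU * XU / UU                          # XU'(UU')^{-1}UX'
z2z2 = XX - n * xbar * xbar - z1z1           # XX' - n xbar xbar' - XU'(UU')^{-1}UX'

# Ordinary least squares fit x = intercept + slope * u; with sum(u) = 0
# the intercept is xbar and the slope is XU' / UU'.
slope = XU / UU
residuals = [a - xbar - slope * b for a, b in zip(x, u)]
residual_ss = sum(r * r for r in residuals)
```

The agreement of `z2z2` with `residual_ss` reflects the fact that $Z_2$ carries the $n - 1 - q$ degrees of freedom left after fitting the mean and the regression.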


The way to handle more general regression problems which arise from other 
types of designs will be indicated in Chapter 12. 


CHAPTER FIVE 
Statement of the Specific Problems to be discussed 


The problems will be formulated in terms of testing of hypotheses, and, in each case, the associated problem in terms of simultaneous confidence interval estimation will also be indicated, although the latter will be discussed in full in sections 14.1-14.11. For each hypothesis to be considered here, the associated (set of) simultaneous confidence bounds will be referred to as A.S.C.B. It will be seen later that corresponding to each hypothesis and its class of alternatives (to be presently stated) there is a 'natural' and 'physically meaningful' set of parameters (or rather functions of the primitive population parameters) which can be easily interpreted as measures of deviations from the hypothesis. It will be also seen that the tests of hypotheses going to be offered here are such that, for each test, it is possible to obtain by inversion (and without running into any very difficult distribution problems) a set of simultaneous confidence bounds on these 'deviations' from the hypothesis. In this section, for most (though not for all) of the hypotheses stated, the structure of the corresponding 'deviations' is also stated without any attempt to show just why and how they are 'appropriate' or 'natural'; this is done later. The following are the problems:

(i) For $N(\xi(p \times 1), \Sigma(p \times p))$ (where $\Sigma$ is symmetric p.d.), to test $H_0 : \Sigma = \Sigma_0$ against $H : \Sigma \ne \Sigma_0$; the associated simultaneous confidence bounds, as will be seen later, will be bounds on characteristic roots of $\Sigma$, i.e., on all $c(\Sigma)$, or by using (A.2.5), bounds on $a'(1 \times p)\,\Sigma(p \times p)\,a(p \times 1)$ (for all arbitrary vectors $a'(1 \times p)$ of unit length each);

(ii) for $N(\xi_h(p \times 1), \Sigma_h(p \times p))$ ($h = 1, 2$, and $\Sigma_1$ and $\Sigma_2$ are both symmetric p.d.), to test $H_0 : \Sigma_1 = \Sigma_2$ against $H : \Sigma_1 \ne \Sigma_2$; the A.S.C.B. will be bounds on all $c(\Sigma_1\Sigma_2^{-1})$, or using (A.2.6), on $a'(1 \times p)\,\Sigma_1(p \times p)\,a(p \times 1)/a'(1 \times p)\,\Sigma_2(p \times p)\,a(p \times 1)$ (for all arbitrary non-null $a'(1 \times p)$);

(iii) for $N(\xi_h(p \times 1), \Sigma(p \times p))$ ($h = 1, \ldots, k$; $\Sigma$ is symmetric p.d.), to test $H_0 : \xi_1 = \xi_2 = \ldots = \xi_k$ against $H$ : not $H_0$, i.e., violation of at least one equality; then the A.S.C.B. will be on $a'(1 \times p)\,\gamma(p \times k)\,b(k \times 1)$ (for all arbitrary non-null $a'(1 \times p)$ and arbitrary $b(k \times 1)$ of unit length), where $\gamma$ stands for the $(p \times k)$ population matrix with $k$ column vectors (each $p \times 1$) $\sqrt{m_1}(\xi_1 - \xi), \sqrt{m_2}(\xi_2 - \xi), \ldots, \sqrt{m_k}(\xi_k - \xi)$ and $\xi = \sum_{h=1}^{k} m_h\xi_h / \sum_{h=1}^{k} m_h$. Notice that $\gamma$ will be of rank $\le \min(p, k-1)$;

(iv) for $N[\xi((p+q) \times 1), \Sigma((p+q) \times (p+q))]$, where $\Sigma$ is symmetric p.d. of the form $\begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{bmatrix}$ (with $p$ and $q$ rows and columns; $p \le q$), to test $H_0 : \Sigma_{12}(p \times q) = 0$ against $\Sigma_{12} \ne 0$; the A.S.C.B. will be on $a'(1 \times p)\,\Sigma_{12}(p \times q)\,\Sigma_{22}^{-1}(q \times q)\,b(q \times 1)$ (for all arbitrary unit-length vectors $a(1 \times p)$ and $b(q \times 1)$) [6].

A number of useful problems can be formally tied up with problem (iii), of
which the more important are the following:

(iiia) for $N(\xi_h, \Sigma)$ ($h = 1, 2$), to test $H_0 : \xi_1 = \xi_2$ against $H : \xi_1 \neq \xi_2$, the A.S.C.B. being now on $a'(1\times p)(\xi_1 - \xi_2)(p\times 1)$ (for all arbitrary non-null $a'(1\times p)$);

(iiib) for $N(\xi, \Sigma)$, to test $H_0 : \xi = \xi_0$ against $H : \xi \neq \xi_0$, the A.S.C.B. being on $a'(1\times p)\,\xi(p\times 1)$ (for all arbitrary non-null $a'(1\times p)$);

(iiic) given an observation matrix $X(p\times n)$ ($p < n$) of stochastic variates with independent $p\times 1$ column vectors $x_h$ ($h = 1, 2, \ldots, n$) having p.d.f.'s $N(E(x_h), \Sigma)$, where $E(X'(n\times p)) = A(n\times m)\,\xi(m\times p)$ ($m < n$; $\xi$ is a matrix of unknown population parameters and $A$ is a non-stochastic matrix of rank $r \le m < n$, whose elements are supposed to be given by the particular experimental situation), to test $H_0 : C(s\times m)\,\xi(m\times p) = 0$, where $C$ is such that $H_0$ is testable (see (12.7.5)), against $H$ : not $H_0$; the A.S.C.B. will be given in section 14.6; this $H_0$ is called the general multivariate linear hypothesis, which includes the usual problems of multivariate analysis of variance and covariance as particular cases and also, of course, the problem (iii) as a very special case;

(iiid) for $N(\xi_h, \Sigma)$ ($h = 1, 2, \ldots, k$), where

$$\xi_h = \begin{bmatrix} \xi_{1h} \\ \xi_{2h} \end{bmatrix} \begin{matrix} p \\ q \end{matrix} \ \text{(say)} \quad\text{and } \Sigma \text{ is symmetric p.d. of the structure } \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{bmatrix} \begin{matrix} p \\ q \end{matrix},$$

to test $H_0 : \xi_{1h}(p\times 1) = \Sigma_{12}(p\times q)\,\Sigma_{22}^{-1}(q\times q)\,\xi_{2h}(q\times 1)$ ($h = 1, 2, \ldots, k$) against $H$ : not $H_0$; the A.S.C.B. will be given in section 14.7; this $H_0$ is called the hypothesis of (a particular kind of) multicollinearity of the means; and finally

(iiie) for the linear regression model of (4.26), to test $H_0 : \beta(p\times q) = 0$ (or, say, $= \beta_0$) against $H : \beta \neq 0$ (or, say, $\neq \beta_0$); the A.S.C.B. will be given in section 14.8.
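Formally tied up with (iv) is the following: (iva) for $N(\xi, \Sigma)$ where $\Sigma$ is symmetric p.d. of the form

$$\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \Sigma_{13} \\ \Sigma_{12}' & \Sigma_{22} & \Sigma_{23} \\ \Sigma_{13}' & \Sigma_{23}' & \Sigma_{33} \end{bmatrix} \begin{matrix} p \\ q \\ r \end{matrix} \qquad (p \le q),$$

to test $H_0 : \Sigma_{12.3} = 0$, where $\Sigma_{12.3}(p\times q) = \Sigma_{12}(p\times q) - \Sigma_{13}(p\times r)\,\Sigma_{33}^{-1}(r\times r)\,\Sigma_{23}'(r\times q)$ and where $\Sigma_{22.3}(q\times q) = \Sigma_{22}(q\times q) - \Sigma_{23}(q\times r)\,\Sigma_{33}^{-1}(r\times r)\,\Sigma_{23}'(r\times q)$; the A.S.C.B. will be on $a'(1\times p)\,\Sigma_{12.3}(p\times q)\,\Sigma_{22.3}^{-1}(q\times q)\,b(q\times 1)$ (for all arbitrary unit-length vectors $a'(1\times p)$ and $b(q\times 1)$).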

In addition to those considered in the two previous paragraphs there are
several other problems whose solutions can be formally thrown back upon those of
(i)-(iv), and these need not be discussed or even stated separately here. But even
within the very restricted set-up (considered in this monograph) of non-sequential,
one-stage, fixed sample-size, two-decision problems of the classical type there are several
problems of great practical and theoretical interest which have had to be excluded,
because (so far as the author is aware) no suitable and reasonably
easy techniques are known at the moment. Among such problems (unfortunately
to be omitted) a particularly important one is the following: for $N(\xi_h, \Sigma_h)$ ($h = 1, 2, \ldots, k > 2$),
to test $H_0 : \Sigma_1 = \Sigma_2 = \cdots = \Sigma_k$ against $H$ : not $H_0$; and of course
the A.S.C.B. on 'appropriate deviations' from $H_0$.

In what follows, chapter 6 will give the derivation of the proposed tests for
$H_0$ in the situations (i)-(iv) and make the formal identification of (iiia)-(iiie) with (iii)
and of (iva) with (iv); chapters 9-11 will give the operating characteristics of the
proposed tests; and chapter 14 will deal with the whole set of simultaneous confidence bounds
associated with each test of chapter 6, the operating characteristics of the proposed set
of simultaneous confidence bounds in each case being easily available from chapters
9-11.




CHAPTER SIX 
Tests for the Null Hypothesis* 


6.1. Direct type I construction not possible. It is well known [40, 41] that
for each composite $H_0$ above there are infinitely many similar regions but no most
powerful (bisimilar) region against any specific composite alternative, i.e., any
composite alternative in which the specifiable elements are given special values. Thus
direct type I construction will not work here.

6.2. Reduction to pseudo-univariate and pseudo-bivariate problems. At this
point suppose that, starting from an $x(p\times 1)$ which is $N(\xi, \Sigma)$, we consider a linear
compound $a'x$ (with an arbitrary constant, i.e., non-stochastic, $a'(1\times p)$ of nonzero
modulus). This $a'x$ is a scalar well known to be $N(a'\xi, a'\Sigma a)$. Notice that $a'\xi$ and
$a'\Sigma a$ are also scalars. Suppose also that, given

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \begin{matrix} p \\ q \end{matrix} \ \text{ which is } \ N\!\left( \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{bmatrix} \right) \qquad (p \le q),$$

we consider linear compounds $a'x_1$, $b'x_2$ (where $a(p\times 1)$ and $b(q\times 1)$ are each non-null
and non-stochastic); then these two scalars $a'x_1$ and $b'x_2$ are well known to be
distributed as a bivariate normal with a correlation coefficient

$$\rho(a, b) = \rho_{ab} = a'\Sigma_{12}b \,\big/\, \big[(a'\Sigma_{11}a)^{1/2}\,(b'\Sigma_{22}b)^{1/2}\big]. \tag{6.2.1}$$
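As a small numerical illustration of (6.2.1), the sketch below evaluates the correlation of the two linear compounds for a given partitioned dispersion matrix. The function name `compound_correlation` and the numerical values of the blocks are hypothetical and chosen only for illustration; the one substantive point shown is that the correlation depends on $a$ and $b$ only through their directions, not their lengths.

```python
import numpy as np

def compound_correlation(a, b, sigma11, sigma12, sigma22):
    """Correlation of the linear compounds a'x1 and b'x2, read off (6.2.1)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    numerator = a @ sigma12 @ b
    denominator = np.sqrt((a @ sigma11 @ a) * (b @ sigma22 @ b))
    return numerator / denominator

# Hypothetical blocks of a positive definite (2 + 2)-dimensional dispersion matrix.
sigma11 = np.array([[2.0, 0.5], [0.5, 1.0]])
sigma22 = np.array([[1.0, 0.2], [0.2, 3.0]])
sigma12 = np.array([[0.3, 0.1], [0.0, 0.4]])

a = np.array([1.0, -1.0])
b = np.array([2.0, 1.0])
rho = compound_correlation(a, b, sigma11, sigma12, sigma22)
# Rescaling a and b by positive constants leaves the correlation unchanged.
rho_scaled = compound_correlation(3.0 * a, 5.0 * b, sigma11, sigma12, sigma22)
```

Because only the directions of $a$ and $b$ matter, the later restriction to unit-length vectors loses nothing.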


Now suppose that, in place of $H_0$ of (i)-(iv) of chapter 5, we consider respectively
(v) $H(a'\Sigma a = a'\Sigma_0 a)$ ($= H_{0a}$) against all $H(a'\Sigma a \neq a'\Sigma_0 a)$ ($= H_a$), ($a$ fixed),
(vi) $H(a'\Sigma_1 a = a'\Sigma_2 a)$ ($= H_{0a}$) against all $H(a'\Sigma_1 a \neq a'\Sigma_2 a)$ ($= H_a$), ($a$ fixed),
(vii) $H(a'\xi_1 = a'\xi_2 = \cdots = a'\xi_k)$ ($= H_{0a}$) against all $H_a$ ($\neq H_{0a}$), ($a$ fixed),
(viii) $H(a'\Sigma_{12}b = 0)$ ($= H_{0ab}$) against all $H(a'\Sigma_{12}b \neq 0)$ ($= H_{ab}$), ($a$, $b$ fixed).
We now consider the totality of all non-null $a$ for (v)-(vii) and all non-null
$a$ and $b$ for (viii). Notice that (a) $\bigcap_a H(a'\Sigma a = a'\Sigma_0 a) = H(\Sigma = \Sigma_0)$, (b) $\bigcap_a H(a'\Sigma_1 a = a'\Sigma_2 a) = H(\Sigma_1 = \Sigma_2)$, (c) $\bigcap_a H(a'\xi_1 = a'\xi_2 = \cdots = a'\xi_k) = H(\xi_1 = \xi_2 = \cdots = \xi_k)$ and (d) $\bigcap_{a,b} H(a'\Sigma_{12}b = 0) = H(\Sigma_{12} = 0)$. We could have worked in terms of any
subset of $a$'s which led by intersection to the same $H_0$, but this we do not do here.
It may be noted that, by the procedure to be used here, apart from set-theoretic
difficulties which, however, do not arise in these applications, the total set of $a$'s or any
subset of it (of the kind considered) will uniquely define an extended type I test
associated with the total set or with that particular subset. Next suppose that, in the
alternative, under (v)-(viii), we substitute "specific" for "all" and thus have four new
situations (ix)-(xii). It is well known that for each of the situations (ix)-(xii) we have
one most powerful (bisimilar) region, so that from these we can construct respective
(modified in a sense to be explained in section 6.3) type I tests for the pseudo-univariate
situations (v) and (vi), and straight type I tests for the pseudo-univariate situation (vii)
and the pseudo-bivariate situation (viii). From these modified type I and straight
type I tests we can try to construct the respective extended type I tests for the situations
(i)-(iv). This ties up (see section 4) the $p$-variate problems (i)-(iii) with the
pseudo-univariate problems (v)-(vii), and the $(p+q)$-variate problem (iv) with the pseudo-bivariate
problem (viii).


* See reference [43] in this connection.


6.3. Modified type I tests. We now take over the notation and symbols
from section 4.


(v) Starting from (4.13), put $\chi_a^2 = n\,a'Sa / a'\Sigma_0 a$ and notice that, at a level $\beta_1$, for
$H(a'\Sigma a = a'\Sigma_0 a)$ ($= H_{0a}$) against all $H(a'\Sigma a > a'\Sigma_0 a)$ we have the one-sided uniformly
most powerful (bisimilar) region: $\chi_a^2 > \chi^2_{\beta_1}(n)$, and, at a level $\beta_2$, for $H_{0a}$ against all
$H(a'\Sigma a < a'\Sigma_0 a)$ we have the one-sided uniformly most powerful (bisimilar) region:
$\chi_a^2 < \chi^2_{\beta_2}(n)$, where $\chi^2_{\beta_1}(n)$ and $\chi^2_{\beta_2}(n)$ are the upper $\beta_1$ and lower $\beta_2$ points of the
$\chi^2$-distribution with d.f. $n$. Notice that $\chi_a^2$ has the central $\chi^2$-distribution with d.f. $n$.
Now consider the union $[\chi_a^2 > \chi^2_{\beta_1}(n)] \cup [\chi_a^2 < \chi^2_{\beta_2}(n)] = U(a)$, say, which, if we
decide to call it a new critical region, will be one of size $\beta_1 + \beta_2 = \beta$ (say). Notice
that, given $\beta$, we can regard $\beta_1$ and $\beta_2$ now as flexible, subject to $\beta_1 + \beta_2 = \beta$. At this
point, we can so choose $\beta_1$ and $\beta_2$, i.e., the tail ends $\chi^2_{\beta_1}$ and $\chi^2_{\beta_2}$, as to make $U(a)$
a locally unbiassed (here it will turn out to be also locally most powerful) critical region
(in the neighbourhood of $H_{0a}$). It will be seen that the condition of unbiassedness
imposes a relation between $\chi^2_{\beta_1}(n)$ and $\chi^2_{\beta_2}(n)$ which involves only $n$ but is independent
of the total size $\beta$ of the region. We now call these tail ends $\chi^2_{1\beta}(n)$ (lower) and $\chi^2_{2\beta}(n)$ (upper).
We now recall from (1.2.2) that this $U(a)$ is also a uniformly unbiassed region (having,
in fact, the stronger property of monotonicity) and is also admissible. With this choice
of $\chi^2_{1\beta}(n)$ and $\chi^2_{2\beta}(n)$ we have now for $H(a'\Sigma a = a'\Sigma_0 a)$ against all $H(a'\Sigma a \neq a'\Sigma_0 a)$ a
modified type I critical region of size $\beta$:

$$\chi_a^2 = n\,a'Sa / a'\Sigma_0 a > \chi^2_{2\beta}(n) \ \text{ or } \ < \chi^2_{1\beta}(n), \tag{6.3.1}$$

which is uniformly unbiassed, monotonic and is also admissible.


(vi) Starting from the product of two distributions like (4.13), put $F_a = a'S_1 a / a'S_2 a$
and notice, as in the previous case, that, at a level $\beta_1$, for $H(a'\Sigma_1 a = a'\Sigma_2 a)$
($= H_{0a}$) against all $H(a'\Sigma_1 a > a'\Sigma_2 a)$ we have the one-sided uniformly most powerful
(bisimilar) region: $F_a > F_{\beta_1}(n_1, n_2)$, and, at a level $\beta_2$, for $H_{0a}$ against all $H(a'\Sigma_1 a < a'\Sigma_2 a)$,
the one-sided uniformly most powerful region: $F_a < F_{\beta_2}(n_1, n_2)$, where $F_{\beta_2}(n_1, n_2)$
and $F_{\beta_1}(n_1, n_2)$ are the lower $\beta_2$ and upper $\beta_1$ points of the $F$-distribution with d.f.
$n_1$ and $n_2$. Notice that $F_a$ has the ordinary $F$-distribution with d.f. $n_1$ and $n_2$. Take
the union of the two regions and, as in the previous case, call it a new critical region,
say $U(a)$, of size $\beta_1 + \beta_2 = \beta$ (say), and, given $\beta$, pick out the tails $F_{\beta_1}(n_1, n_2)$
and $F_{\beta_2}(n_1, n_2)$ so as to make $U(a)$ a locally unbiassed region (in the neighbourhood
of $H_{0a}$). Notice that this imposes an extra relation between $F_{\beta_1}(n_1, n_2)$ and
$F_{\beta_2}(n_1, n_2)$ which involves only $n_1$, $n_2$ and not the total size $\beta$ of the region. Recall
also from (1.2.2) that this is a uniformly unbiassed region (also having the monotonicity
property) and also admissible. As before, with this choice of $F_{\beta_2}$ and $F_{\beta_1}$, to be called
$F_{1\beta}(n_1, n_2)$ and $F_{2\beta}(n_1, n_2)$, we have now for $H(a'\Sigma_1 a = a'\Sigma_2 a)$ against all $H(a'\Sigma_1 a \neq a'\Sigma_2 a)$
a modified type I critical region of size $\beta$ (uniformly unbiassed, monotonic
and admissible):

$$F_a = a'S_1 a / a'S_2 a > F_{2\beta}(n_1, n_2) \ \text{ or } \ < F_{1\beta}(n_1, n_2). \tag{6.3.2}$$


(vii) Start from (4.16)-(4.22) and recall from (ii) of section 2.3 that for $H(a'\xi_1 = a'\xi_2 = \cdots = a'\xi_k)$
($= H_{0a}$) against any specific $H_a$ ($\neq H_{0a}$) there is the most
powerful (bisimilar) critical region (of size, say, $\beta$) which is a one-sided $F$-region, and
by taking the union of these regions (for fixed $a$ but by variation over the parameters specifying $H_a$)
we have the straight type I region of size, say, $\beta$ given by (notice that $F_a$ has the
ordinary $F$-distribution with d.f. $n_1$ and $n_2$)

$$F_a = a'S^*a / a'Sa > F_\beta(n_1, n_2), \tag{6.3.3}$$

where $F_\beta(n_1, n_2) = F_\beta$ (say) is obtained from $P(F_a > F_\beta \mid H_{0a}) = \beta$.


This is well known to be a type II or likelihood ratio region as well and is also
well known to have a number of desirable properties (including uniform unbiassedness,
the stronger property of monotonicity and also admissibility).


(viii) Start from (4.15) and put 


Su Se Je ifr 
ini [Yi: Yl]. s (6.3.4) 
Si. Soo 49 Y, p q 
pg 
Next put

$$r_{ab} = a'S_{12}b \,\big/\, \big[(a'S_{11}a)^{1/2}\,(b'S_{22}b)^{1/2}\big], \tag{6.3.5}$$

and notice that, at a level $\beta$, for $H(a'\Sigma_{12}b = 0)$ ($= H_{0ab}$) against all $H(a'\Sigma_{12}b > 0)$
we have the one-sided uniformly most powerful (bisimilar) region: $r_{ab} > r_\beta(n-1)$,
and for $H_{0ab}$ against all $H(a'\Sigma_{12}b < 0)$ the one-sided uniformly most powerful (bisimilar)
region: $r_{ab} < -r_\beta(n-1)$, where $r_\beta(n-1)$ ($= r_\beta$, say) is given by

$$P(r_{ab} > r_\beta \mid H_{0ab}) = \beta. \tag{6.3.6}$$

Notice also that this $r_{ab}$ has the distribution of the central correlation coefficient with
d.f. $(n-1)$. Taking the union of the two regions we shall have a straight type I
critical region of size $2\beta$ given by

$$[r_{ab} > r_\beta(n-1)] \cup [r_{ab} < -r_\beta(n-1)]. \tag{6.3.7}$$

This is well known to be a type II or likelihood ratio region as well and it is also well
known that this has a number of desirable properties (including uniform unbiassedness,
the stronger property of monotonicity and also admissibility).


6.4. Actual construction of extended type I regions.


(i) By the test procedure (6.3.1), over $\chi^2_{1\beta}(n) < \chi_a^2 < \chi^2_{2\beta}(n)$ we accept
$H(a'\Sigma a = a'\Sigma_0 a)$, so that, by using the heuristic principle of section 2.6, over $\bigcap_a [\chi^2_{1\beta}(n) < \chi_a^2 < \chi^2_{2\beta}(n)]$
we accept $\bigcap_a H(a'\Sigma a = a'\Sigma_0 a) = H(\Sigma = \Sigma_0) = H_0$, and thus
over its complement $\bigcup_a [\chi_a^2 > \chi^2_{2\beta}(n) \text{ or } < \chi^2_{1\beta}(n)]$ we reject $H_0$. This may be set up as
the extended type I test. To obtain $\bigcap_a [\chi^2_{1\beta}(n) < \chi_a^2 < \chi^2_{2\beta}(n)]$ we note that a particular
$S$ would belong to the intersection if, for that $S$, $\chi^2_{1\beta} < n\,a'Sa/a'\Sigma_0 a < \chi^2_{2\beta}$ for all non-null
$a$. This statement $\Leftrightarrow$ $\chi^2_{1\beta} <$ smallest $n\,a'Sa/a'\Sigma_0 a \le$ largest $n\,a'Sa/a'\Sigma_0 a < \chi^2_{2\beta}$,
the "largest" and "smallest" being under variation of $a$ (for given $S$). Now, given
$S$, and of course $\Sigma_0$, the largest and smallest values of $n\,a'Sa/a'\Sigma_0 a$ are easily seen from
(A.2.5) to be the largest and smallest roots, say $c_p$ and $c_1$, of the $p$-th degree equation in $c$:

$$|nS - c\,\Sigma_0| = 0, \tag{6.4.1}$$

all the $p$ roots $c_1, c_2, \ldots, c_p$ being, in this situation, a.e. positive, since $\Sigma_0$ is given to be
symmetric p.d. and $S$ is, by definition and the assumptions, a.e. p.d. Starting out
from the (modified) type I test (6.3.1) for $H_{0a}$ we have for $H_0$, i.e., $H(\Sigma = \Sigma_0)$, the
extended type I critical region

$$c_p > \chi^2_{2\beta}(n) \ \text{ and/or } \ c_1 < \chi^2_{1\beta}(n). \tag{6.4.2}$$

To find $\chi^2_{2\beta}$ and $\chi^2_{1\beta}$ we make use of the condition of local unbiassedness (which
involves only $n$) (see (v) of 6.3 and also 11.5) and write down the further condition
(which now completely determines $\chi^2_{2\beta}$ and $\chi^2_{1\beta}$)

$$P(\chi^2_{1\beta} \le c_1 \le c_p \le \chi^2_{2\beta} \mid H_0) = 1 - \alpha. \tag{6.4.3}$$

Notice from (A.7.1.1) that under $H_0$ the distribution of $c_1, \ldots, c_p$, and thus also of $c_1$
and $c_p$, turns out to be independent of $\Sigma_0$, depending only on $p$ and $n$; thus the c.d.f. (6.4.3)
depends only on $\alpha$, $p$ and $n$, so that it will now be proper to write the tail ends
as $c_{1\alpha}(p, n)$ and $c_{2\alpha}(p, n)$.
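The reduction used in (i), and again in (ii) below, rests on the fact that the extreme values of a ratio of quadratic forms $a'Aa/a'Ba$ over non-null $a$ are the extreme roots of $|A - cB| = 0$. The following is only a numerical illustration of that fact, not the argument of (A.2.5); the helper name `extreme_roots`, the matrix `sigma0` and the sample size are hypothetical. The sketch works with $S$ itself, and multiplying $S$ by $n$ simply multiplies every root by $n$.

```python
import numpy as np

def extreme_roots(A, B):
    """Smallest and largest roots c of |A - cB| = 0, for symmetric A and p.d. B."""
    L = np.linalg.cholesky(B)                  # B = L L'
    L_inv = np.linalg.inv(L)
    # The roots of |A - cB| = 0 are the eigenvalues of L^{-1} A L'^{-1}.
    roots = np.linalg.eigvalsh(L_inv @ A @ L_inv.T)   # returned in ascending order
    return roots[0], roots[-1]

rng = np.random.default_rng(0)
p, n = 3, 20
sigma0 = np.array([[2.0, 0.3, 0.0],
                   [0.3, 1.0, 0.2],
                   [0.0, 0.2, 1.5]])           # hypothetical Sigma_0
X = rng.standard_normal((p, n))
S = X @ X.T / n                                 # illustrative sample dispersion matrix

c_1, c_p = extreme_roots(S, sigma0)

# Ratios a'Sa / a'Sigma_0 a for many random directions a all fall in [c_1, c_p].
directions = rng.standard_normal((1000, p))
ratios = (np.einsum("ij,jk,ik->i", directions, S, directions)
          / np.einsum("ij,jk,ik->i", directions, sigma0, directions))
```

Checking all directions therefore reduces to checking the two extreme roots, which is what makes the intersection over $a$ computable.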
(ii) The general nature of the arguments will be exactly the same as in the
previous case. Starting from (6.3.2), over $F_{1\beta} < F_a < F_{2\beta}$ we accept $H(a'\Sigma_1 a = a'\Sigma_2 a)$,
so that, by using the principle of section 2.6, over $\bigcap_a [F_{1\beta} < F_a < F_{2\beta}]$ we
accept $\bigcap_a H(a'\Sigma_1 a = a'\Sigma_2 a) = H(\Sigma_1 = \Sigma_2) = H_0$, and thus over its complement
$\bigcup_a [F_a > F_{2\beta} \text{ or } < F_{1\beta}]$ we reject $H_0$. As before we set it up as the extended type I
test and, using (A.2.6), notice that the statement $F_{1\beta} < a'S_1 a/a'S_2 a < F_{2\beta}$ (where $S_1$,
$S_2$, $F_{1\beta}$ and $F_{2\beta}$ are held fixed and $a$ alone is varied) $\Leftrightarrow$ $F_{1\beta} < c_1 \le c_p < F_{2\beta}$,
where $c_1$ and $c_p$ are the smallest and largest roots of the $p$-th degree equation in $c$:

$$|S_1 - c\,S_2| = 0, \tag{6.4.4}$$

all the $p$ roots being here, a.e., positive, since $S_1$ and $S_2$ are, by the conditions of the
problem, a.e. p.d. Starting out from the (modified) type I test (6.3.2) for $H_{0a}$ we
have thus for $H(\Sigma_1 = \Sigma_2)$ the extended type I region

$$c_p > F_{2\beta}(n_1, n_2) \ \text{ and/or } \ c_1 < F_{1\beta}(n_1, n_2). \tag{6.4.5}$$

As in the previous case, given $\alpha$, to determine $F_{2\beta}$ and $F_{1\beta}$ we first take over
(see (vi) of 6.3 and also 11.1) the relation (involving only $n_1$ and $n_2$) between $F_{2\beta}$
and $F_{1\beta}$ imposed by the condition of local unbiassedness and write down the further
condition (which now completely determines $F_{2\beta}$ and $F_{1\beta}$)

$$P(F_{1\beta} \le c_1 \le c_p \le F_{2\beta} \mid H_0) = 1 - \alpha. \tag{6.4.6}$$

Notice from (A.7.2) that under $H_0$ the distribution of $c_1, \ldots, c_p$, and thus also of $c_1$ and
$c_p$, happens to be independent of the common value of $\Sigma_1$ and $\Sigma_2$ and also of $\xi_1$, $\xi_2$,
depending only on $p$, $n_1$ and $n_2$, and thus the c.d.f. (6.4.6) depends only on $\alpha$, $p$, $n_1$ and $n_2$, so
that the tail ends $F_{1\beta}$ and $F_{2\beta}$ can be more appropriately written as $c_{1\alpha}(p, n_1, n_2)$ and
$c_{2\alpha}(p, n_1, n_2)$. The actual distribution problem on which depends the evaluation
of the left side of (6.4.6) is solved in section (A.9.7).


(iii) By the test procedure (6.3.3), over $F_a = a'S^*a/a'Sa < F_\beta(n_1, n_2)$ we
accept $H(a'\xi_1 = \cdots = a'\xi_k)$ ($= H_{0a}$), so that, using the principle of section 2.6, over
$\bigcap_a [F_a = a'S^*a/a'Sa < F_\beta]$ we accept $\bigcap_a H(a'\xi_1 = \cdots = a'\xi_k) = H(\xi_1 = \xi_2 = \cdots = \xi_k)$
$= H_0$ and over its complement $\bigcup_a [F_a > F_\beta]$ we reject $H_0$. We set it up as the
extended type I test and, using (A.2.6), notice that the statement $a'S^*a/a'Sa < F_\beta$ (where
$S^*$, $S$ and $F_\beta$ are held fixed and $a$ alone varied) $\Leftrightarrow$ $c_p < F_\beta$, where $c_p$ is the largest
root of the $p$-th degree equation in $c$:

$$|S^* - c\,S| = 0. \tag{6.4.7}$$

From the definitions and assumptions of section 6.2 and chapter 4 it is easy to check
that $S$ is, a.e., p.d. while $S^*$ is, a.e., at least p.s.d. of rank $r = \min(p, k-1)$. It will of
course be, a.e., p.d. if $p \le k-1$. In any case, we can say that, out of the $p$ roots of
(6.4.7), $p - r$ will always be zero, while $r$ roots, to be called $c_1 \le c_2 \le \cdots \le c_r$, will be,
a.e., positive, where $r = \min(p, k-1)$. Starting out from the straight type I test
(6.3.3) for $H_{0a}$ we have thus for $H(\xi^* = 0)$ the extended type I region:

$$c_r > F_\beta(n_1, n_2), \tag{6.4.8}$$

where, given the size $\alpha$ of (6.4.8), $F_\beta$ is to be determined by

$$P(c_r > F_\beta \mid H_0) = \alpha. \tag{6.4.9}$$

Notice from (A.7.5) that under $H_0$ the distribution of $c_1, \ldots, c_r$, and thus also of $c_r$,
happens to depend only on $p$, $n_1$ and $n_2$, i.e., on $p$, $k-1$, $n-k$ (where $n$ is the total
number of observations and $k$ the total number of samples or populations), being
independent of all other nuisance parameters. Also the c.d.f. (6.4.9) depends only
on $\alpha$, $p$, $k-1$, $n-k$. Thus the tail end $F_\beta$ can now be more appropriately written as
$c_\alpha(p, k-1, n-k)$. The actual distribution problem on which the evaluation of the
left side of (6.4.9) depends is solved in section 7.6 and chapter 8.
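The rank statement in (iii) can be illustrated numerically. In the sketch below the "between samples" and "within samples" matrices are the usual unscaled sums of products; this is an assumption made for illustration, since the definitions of $S^*$ and $S$ in (4.16)-(4.22) are not reproduced in this section (any fixed scaling of either matrix only rescales the roots). The function name `manova_roots` and the simulated data are hypothetical.

```python
import numpy as np

def manova_roots(samples):
    """Roots of |S* - cS| = 0 for a list of (p x n_h) sample arrays.

    S* is taken as the between-samples sum of products and S as the pooled
    within-samples sum of products (an illustrative, unscaled convention).
    """
    p = samples[0].shape[0]
    grand_mean = np.hstack(samples).mean(axis=1, keepdims=True)
    s_star = np.zeros((p, p))
    s_within = np.zeros((p, p))
    for x in samples:
        mean_h = x.mean(axis=1, keepdims=True)
        s_star += x.shape[1] * (mean_h - grand_mean) @ (mean_h - grand_mean).T
        s_within += (x - mean_h) @ (x - mean_h).T
    L_inv = np.linalg.inv(np.linalg.cholesky(s_within))
    return np.linalg.eigvalsh(L_inv @ s_star @ L_inv.T)   # ascending order

rng = np.random.default_rng(1)
p, k = 4, 3
samples = [rng.standard_normal((p, 10)) + h for h in range(k)]  # shifted means
roots = manova_roots(samples)
n_positive = int(np.sum(roots > 1e-8))
```

With $p = 4$ and $k = 3$ only $\min(p, k-1) = 2$ roots are non-zero, so the test statistic is the larger of those two.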


(iv) By the test procedure (6.3.7), over

$$r_{ab}^2 = \frac{(a'S_{12}b)^2}{(a'S_{11}a)(b'S_{22}b)} < r_\beta^2(n-1)$$

we accept $H(a'\Sigma_{12}b = 0)$, so that, using the principle of section 2.6, over $\bigcap_{a,b} [r_{ab}^2 < r_\beta^2(n-1)]$
we accept $\bigcap_{a,b} H(a'\Sigma_{12}b = 0) = H(\Sigma_{12} = 0) = H_0$, and over its complement
$\bigcup_{a,b} [r_{ab}^2 > r_\beta^2(n-1)]$ we reject $H_0$. As before, we set this up as an extended type
I test and, using (A.2.3), notice that the statement $(a'S_{12}b)^2/(a'S_{11}a)(b'S_{22}b) < r_\beta^2(n-1)$
(where $S_{11}$, $S_{12}$, $S_{22}$ and $r_\beta$ are held fixed and $a$, $b$ alone varied) $\Leftrightarrow$ $c_p < r_\beta^2(n-1)$,
where $c_p$ is the largest root of the $p$-th degree equation in $c$:

$$|c\,S_{11} - S_{12}S_{22}^{-1}S_{12}'| = 0. \tag{6.4.10}$$

From the definitions and assumptions of chapter 4 and section 6.2 it is easy to see
that, a.e., $S_{11}$ is p.d. and so also $S_{12}S_{22}^{-1}S_{12}'$, so that, a.e., all roots will be positive.
Under these conditions it is known (from (A.1.16)) that the $p$ roots will all, a.e., lie
between 0 and 1, satisfying the condition $0 < c_1 \le c_2 \le \cdots \le c_p < 1$. Starting out
from the straight type I test (6.3.7) for $H_{0ab}$ we have thus for $H(\Sigma_{12} = 0)$ the extended
type I region:

$$c_p > r_\beta^2(n-1), \tag{6.4.11}$$

where, given that $\alpha$ is the size of (6.4.11), $r_\beta^2$ is to be determined by

$$P(c_p > r_\beta^2 \mid H_0) = \alpha. \tag{6.4.12}$$

Notice from (A.7.3) that under $H_0$ the distribution of $c_1, \ldots, c_p$, and thus also of $c_p$, happens
to depend only on $p$, $q$ and $n$, being independent of all other nuisance parameters.
Thus the c.d.f. (6.4.12) also depends only on $\alpha$, $p$, $q$, $n$ and hence the tail end $r_\beta^2$ can
now be more appropriately written as $c_\alpha(p, q, n)$. The actual distribution problem on which
the evaluation of the left side of (6.4.12) depends is solved in section 7.4 and
chapter 8.
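A numerical sketch of the reduction in (iv): the roots of $|cS_{11} - S_{12}S_{22}^{-1}S_{12}'| = 0$ lie between 0 and 1, and no choice of $a$, $b$ gives a squared correlation $r_{ab}^2$ above the largest root. The function name `squared_canonical_roots`, the simulated data, and the divisor $n-1$ used to form the sample dispersion matrix are assumptions for illustration only (the divisor does not affect the roots).

```python
import numpy as np

def squared_canonical_roots(s11, s12, s22):
    """Roots c of |c S11 - S12 S22^{-1} S12'| = 0, in ascending order."""
    L_inv = np.linalg.inv(np.linalg.cholesky(s11))
    m = L_inv @ s12 @ np.linalg.solve(s22, s12.T) @ L_inv.T
    return np.linalg.eigvalsh((m + m.T) / 2.0)   # symmetrise against rounding

rng = np.random.default_rng(2)
p, q, n = 2, 3, 30
Y = rng.standard_normal((p + q, n))
Y = Y - Y.mean(axis=1, keepdims=True)            # centre each variate
S = Y @ Y.T / (n - 1)
s11, s12, s22 = S[:p, :p], S[:p, p:], S[p:, p:]

roots = squared_canonical_roots(s11, s12, s22)

# Squared correlations of random pairs of compounds a'x1, b'x2.
a = rng.standard_normal((500, p))
b = rng.standard_normal((500, q))
num = np.einsum("ij,jk,ik->i", a, s12, b) ** 2
den = (np.einsum("ij,jk,ik->i", a, s11, a)
       * np.einsum("ij,jk,ik->i", b, s22, b))
r2 = num / den
```

The largest root is thus the supremum of $r_{ab}^2$ over all compounds, which is why the extended test needs only that one root.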


CHAPTER SEVEN 


Reduction of Some Distribution Problems and Some 
Actual Distributions* 


7.1. Distribution of rectangular co-ordinates. As in (A.8.6), put $X(p\times n) = T(p\times p)\,L(p\times n)$,
subject to $LL' = I(p)$, and observe that $T$ and $L_I$ have the distribution


ALL). 


pn n ELA Ра 
291 (2л) 2 |X|2] exp [—3 tr DAFT) Й i-i ad dL, | БОО 
i= ‘D) |L; 


i-1 


Е Hel) 


Now, using (A.8.6.3) to integrate out over L,, we have the following distribution 
for T [81]: 


m p@—1) n . M ES А 
[ 1? and шы (с) exp [—4 tr Ed YE apad... (7.1.2) 
i=l 


isl 


From (7.1.2), by using (A.6.1.12) and the fact that $|nS| = |T|^2 = \prod_{i=1}^{p} t_{ii}^2$, we have
the following distribution for $S$ (usually known as the Wishart distribution):


[Se is n etat ig. (7.1.3) 
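The factorization $X = TL$ with $LL' = I(p)$ that underlies this section can be computed with a QR decomposition of $X'$. The sketch below is only an illustration under two assumptions: that $T$ is taken lower triangular with positive diagonal, and that $nS = XX'$; the function name `rectangular_coordinates` is hypothetical. It also checks the determinant identity $|XX'| = \prod_i t_{ii}^2$ used above.

```python
import numpy as np

def rectangular_coordinates(X):
    """Factor X (p x n, p <= n) as X = T L, T lower triangular, L L' = I(p)."""
    Q, R = np.linalg.qr(X.T)           # X' = Q R, Q is n x p, R is p x p upper
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    T = R.T * signs                    # flip columns so that diag(T) > 0
    L = (Q * signs).T                  # matching sign flips keep T L = X
    return T, L

rng = np.random.default_rng(3)
p, n = 3, 8
X = rng.standard_normal((p, n))
T, L = rectangular_coordinates(X)

det_xx = np.linalg.det(X @ X.T)
prod_t2 = np.prod(np.diag(T)) ** 2
```

Since $XX' = TT'$, the elements of $T$ carry everything about the sample dispersion matrix, while $L$ is the part integrated out.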


7.2. Distribution of characteristic roots of the sample dispersion matrix S.
Using the results of (4.7.1) we start, without any loss of generality, from the canonical
form (A.7.1.1), use (A.3.6) to set $X(p\times n) = M(p\times p)\,D_{\sqrt{c}}(p\times p)\,L(p\times n)$, where
$LL' = I(p)$ and $M$ is orthogonal ($\perp$) with a positive first row, take over from (A.6.3.1) the
Jacobian $J(X : M_I, c\text{'s}, L_I)$ and have for $M_I$, $c$'s and $L_I$ the distribution:


рп n-p-1 
9» [чет ñ vi exp [4 tr Dy,MD,M'}EE A de, 
ї=1 i=l 
Si a e e Ma me ауы 7.2.1 
d mod [ П (e | MM) | [Xnm) ` Wa 
AM) (ar, 3029) |р, 


Using (A.8.6.3) to integrate out over $L_I$ we have for $c$'s and $M_I$ the distribution


mn 

їр, n) | tem fq! | xot t Dp MDM) — 
— Шу (ry 

| 0(M 5) I 


хой в | ee [2 («о aan] 


"ir 


* See references [24, 31, 32, 54] in this connection. 




characterized in (A.3.17) and $e_i = (1 - c_i)/c_i$. Now take over from (A.6.5.1) the
Jacobian J(X,, X, : P, U, св, Му, Му, Lor) and obtain for T, U,c's Мур, Mz, and 
Ly, the distribution 


bo Dac | Papa 
pum) $m. ба А y] exp | —1 tr (2 Ее | [ JD | 0 | 
9 i; Lo | I(q—p) 
UD;,,U' m 
x iy [түш th wi d fi ег xc er vede: mod [ I («—e)] 
T" M^ U' is d 


ам f 2 ME UST db] |0008). | maa 
«I i Pd lr, | (2 |@ ш) м, у |7902.) AA VE 


Using (A.8.6.3) to integrate out over M,; and Lr, we have for T: Uu, Myr 
and e's the distribution 


2»* [1/(2m)" 2 ШЇ (1—y,)*] Fip, n—4) Pig, n) 


Din ‚—[Шууп-у ,| 0] jA 
{ UD,,U' UMT 
X exp; —1 tr D jn Din- 10 ы d 
-[ AVL ] | | 1/1-у | Tau qe 
0 Lo | 1@—р)-1- 


' 25 п=р-1— -1 
x [Uju Yt oyi a® П ег 2 de; mod [Tl (ej—e)] dM І 
isl i<j=1 


=1 


д(М,м',) 


rmn AR) 


This is the point to which, in the general case, the reduction of the distribution problem
can be conveniently carried out. However, if all $c(\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}') = 0$, i.e., all $\gamma_i$'s
$= 0$ (which according to (A.1.17) happens if and only if $\Sigma_{12} = 0$), then further
reduction is possible and (7.4.2) reduces to

т(р+ ~ 
splen] 2 Рр, пф) Flan) exp [—4 (tr UD,,U'4- tr FP)]|U "4U 


(MM) ie 


Em rui 


x A ur am fi e aan та, mod [ 1E, ece]atul 
Note that $\operatorname{tr} TT' = \sum_{i \ge j} t_{ij}^2$


and | exp [— $ tr TI I uj íi dij = 2 AN2 qua-1)A í pe са) 
Gus i-i i>jel i=l 2 


and hence integrating out over 7 and Му, obtain for U and e’s the distribution 


nP nipta) 40-107 a 
2 4 ] I r( 


Prts "= Fp, n—a) Fip, a) Fla.) 


i=1 


n rci 


x exp [— } tr UD,,,U'] |U|"-7dU fi e he, mod т UT es (7.4.4) 
i<j= 


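Now put $U D_{\sqrt{1+e}} = V$, use (A.5.2) and (A.8.7) to integrate out over $V$ and obtain for
$e$'s the distribution

$$\text{Const}\ \prod_{i=1}^{p} \frac{e_i^{(n-p-q-1)/2}\, de_i}{(1+e_i)^{n/2}}\ \operatorname{mod}\Big[\prod_{i<j=1}^{p} (e_i - e_j)\Big]. \tag{7.4.5}$$

Putting $e_i = (1 - c_i)/c_i$ we have for $c_i$'s, on the null hypothesis $\Sigma_{12} = 0$, i.e.,
$c(\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}') = 0$, i.e., $\gamma_i = 0$, the distribution

$$\text{Const}\ \prod_{i=1}^{p} (1 - c_i)^{(n-p-q-1)/2}\, c_i^{(q-p-1)/2}\, dc_i\ \operatorname{mod}\Big[\prod_{i<j=1}^{p} (c_i - c_j)\Big], \tag{7.4.6}$$

where the

$$\text{Const} = \pi^{p/2} \prod_{i=1}^{p} \Gamma\!\Big(\frac{n-i+1}{2}\Big) \Big/ \prod_{i=1}^{p} \Gamma\!\Big(\frac{n-q-i+1}{2}\Big)\, \Gamma\!\Big(\frac{q-i+1}{2}\Big)\, \Gamma\!\Big(\frac{p-i+1}{2}\Big).$$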


An important special case is when $p = 1$, and this we shall consider both on the null
and on the non-null hypothesis. In this case there is only one non-zero (and here
positive) $e$ or $c$ and only one possible non-zero (and here positive) $\gamma$. Thus
$\operatorname{mod}\big[\prod_{i<j}(e_i - e_j)\big]$ or $\operatorname{mod}\big[\prod_{i<j}(c_i - c_j)\big]$ will drop out. On the null hypothesis $\gamma = 0$,
the distribution of $e$ and $c$, as special cases of (7.4.5) and (7.4.6), will respectively be

$$\frac{\Gamma(n/2)}{\Gamma(q/2)\,\Gamma((n-q)/2)}\ \frac{e^{(n-q-2)/2}\, de}{(1+e)^{n/2}}, \tag{7.4.7}$$

$$\frac{\Gamma(n/2)}{\Gamma(q/2)\,\Gamma((n-q)/2)}\ (1-c)^{(n-q-2)/2}\, c^{(q-2)/2}\, dc. \tag{7.4.8}$$
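A quick numerical sanity check of the $p = 1$ null case, under the assumption that (7.4.8) is the density $\Gamma(n/2)/[\Gamma(q/2)\Gamma((n-q)/2)]\,(1-c)^{(n-q-2)/2}c^{(q-2)/2}$ on $0 < c < 1$ and (7.4.7) the corresponding density of $e = (1-c)/c$ (the printed forms are partly illegible, so this reading is a reconstruction). The function names and the values of $n$ and $q$ are hypothetical.

```python
import math
import numpy as np

def null_density_c(c, n, q):
    """Assumed reading of (7.4.8): a Beta(q/2, (n-q)/2) density in c."""
    const = math.gamma(n / 2) / (math.gamma(q / 2) * math.gamma((n - q) / 2))
    return const * (1 - c) ** ((n - q - 2) / 2) * c ** ((q - 2) / 2)

def null_density_e(e, n, q):
    """Assumed reading of (7.4.7): density of e = (1 - c)/c."""
    const = math.gamma(n / 2) / (math.gamma(q / 2) * math.gamma((n - q) / 2))
    return const * e ** ((n - q - 2) / 2) / (1 + e) ** (n / 2)

n, q = 12, 4
# Midpoint rule on (0, 1): the density in c should integrate to one.
grid = (np.arange(200000) + 0.5) / 200000
total = null_density_c(grid, n, q).mean()

# Change of variables e = (1 - c)/c, |de/dc| = 1/c^2, links the two forms.
c0 = 0.3
e0 = (1 - c0) / c0
lhs = null_density_c(c0, n, q)
rhs = null_density_e(e0, n, q) / c0 ** 2
```

The agreement of `lhs` and `rhs` confirms that the two displayed forms are the same distribution written in the two variables.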
For the distribution on the non-null hypothesis $\gamma \neq 0$, we start from (7.4.2), put
$U(1\times 1) = u$ (a scalar), so that $U D_{1+e} U' = (1+e)u^2$, $M(1\times q) = m'(1\times q) = (m_1, m_2, \ldots, m_q)$
(say), and take $m_2, \ldots, m_q$ to be the so-called independent elements of $m$, so that


(MM, )| = it dm, [2(1— x а), 


dM. 
“| (Map) му 02 ich 


and obtain for e the distribution 


9+1) n—g—2 
3 EIE 


Ma) n 
[uem ? a-y?]ru,^-oFe. me ? de 


1 i 
x [ exp |а iterum th) x ] 
и, T, mbr 


q EV i а ў 
x u™ du Y acia? П "(1 X mè) д w (7.4.9) 
i=l i=2 


i=2 


Now putting $m_1 = \cos\theta$ so that $\sum_{i=2}^{q} m_i^2 = \sin^2\theta$, we note from (A.8.4) that


f exp [22 uty, (1 £m) ] Па] (1-х mi) 


+ 
sing < (5 mi ) < sin 6+d (sin 0) 


ү) ] exp [+4 uta обв 2 (sin 6-240. — ... (7.4.10) 


q— 
2 


= [a= 
Using (7.4.10) and also integrating out’ over tjs (j= 1,2,..., i and i = 2, 
3, ..., р). and setting v? = (1--e)u?, we have (7.4.9) reducing to 


= m 


Const (which is easily obtained) e a ША 14e)? 


x f exp [ = TE (w+ 2] cosh| 22, vty, соз 0/4/14е Jet det, (sin 0y'-*d0, 
v, 11,0 


(7.4.11) 


the limits of $v$ and $t_{11}$ being from $0$ to $\infty$ and of $\theta$ from $0$ to $\pi$. To evaluate the integral


| [| exp [-a(a2+y2—2hny сов 6)](ay)"de dy (sin 0)"40 
‚8—0 2, y=0 


2, 


=: 
we proceed as follows


T л}? 
( exp [2abzy cos 0] (sin 0)"40 = 2 | cosh (2abxy cos 0) віп” 40 
0 0 


ED eu) r (5) (obey). 8, аку); dy AGA EQ) 


where $J$ stands for the Bessel $J$ function in the usual notation [52] (Watson's Bessel
functions, p. 79, formula (9)). Thus we have


m o m 


Integral — г( ака г ер * f exp[—a(a?-|-y?)](xy) ï ? „rl 2abzy)dzdy. 
z,y-0 
(7.4.13) 


To evaluate this put $xy = z$ and $x/y = e^{v}$, so that $J(x, y : z, v) = 1/2$ and the range of
$z$ and $v$ could be taken as: $0 \le z < \infty$ and $-\infty < v < \infty$. Thus we have, from the
symmetry of the integrand,


m 


mo o ae 
Integral = г zJ г mE (ab) d f | exp [—2az cosh v]z * Im/o(2ab2)dz dv. 
с=0 v=) 


(7.4.14) 
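The change of variables $xy = z$, $x/y = e^{v}$ used to pass to (7.4.14) can be checked numerically. The sketch below, with hypothetical function names and an arbitrary test point, confirms by central differences that the absolute Jacobian is $1/2$ and that $x^2 + y^2 = 2z\cosh v$, which is what turns $\exp[-a(x^2+y^2)]$ into $\exp[-2az\cosh v]$.

```python
import numpy as np

def xy_from_zv(z, v):
    """Invert xy = z, x/y = exp(v) for positive x and y."""
    x = np.sqrt(z) * np.exp(v / 2.0)
    y = np.sqrt(z) * np.exp(-v / 2.0)
    return x, y

def jacobian_abs(z, v, h=1e-6):
    """Absolute value of d(x, y)/d(z, v), estimated by central differences."""
    x_zp, y_zp = xy_from_zv(z + h, v)
    x_zm, y_zm = xy_from_zv(z - h, v)
    x_vp, y_vp = xy_from_zv(z, v + h)
    x_vm, y_vm = xy_from_zv(z, v - h)
    dx_dz, dy_dz = (x_zp - x_zm) / (2 * h), (y_zp - y_zm) / (2 * h)
    dx_dv, dy_dv = (x_vp - x_vm) / (2 * h), (y_vp - y_vm) / (2 * h)
    return abs(dx_dz * dy_dv - dx_dv * dy_dz)

z0, v0 = 1.7, -0.4            # arbitrary test point
x0, y0 = xy_from_zv(z0, v0)
jac = jacobian_abs(z0, v0)
```

The Jacobian does not depend on the point chosen, so the factor $1/2$ can be taken outside the double integral.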


But putting $\nu = 0$ in formula (9), p. 181, Watson's Bessel functions, and noting
that $K$ stands for the Bessel $K$-function in the usual notation, we have

$$\int_{v=0}^{\infty} \exp[-2az\cosh v]\, dv = K_0(2az). \tag{7.4.15}$$
50 mr 
Hence Integral = r( eil ) г( 5) (ab) Ы f Innpl2abz)Ko(2az)z ^ de, ^... (7.4.16) 
X о 


Now putting $\mu = 0$, $\nu = m/2$ and $\lambda = n - m/2$ in (1) of (13.45), p. 410,
Watson's Bessel functions, and checking up on the validity conditions indicated
there, we have


i Каз Paese | Ut [ее {нт 2 [earn (242) | 
0 


x Fy [oma "ihe es (7411) 
If now we put $c = ib$, we shall have $J_{m/2}(2aibz) = i^{m/2}\, I_{m/2}(2abz)$, so that, substituting
in (7.4.16), we should have


bea ff) (2 ) 2) over) 


ә 
x ,F (^27, EL IS p |. 2. (14.18) 


Substituting in (7.4.11) we have for e the distribution proportional to 


n—4—2 n 
wc qo ДУ 2 lie 
JA i E dejte. vas, (7.4.19) 


Now putting $c = 1/(1+e)$, we have for $c$ the distribution


r(3)a-»** m mens ng cus 
iM af, ( ap oe 1 ye\(1—e) 2 c? de. .. (7.4.20) 
reg) ss 


7.5. Distribution of partial canonical correlations. Notice that (A.7.4.5) 


can be rewritten as 


Ip) ; 0 
np qr) j 2 : 
[um E fa- [= 6 Е : [m : | 
0 а 114—2) 


Lp) : —[Юуут=у 9] 


E : | 
«[ je: Ху) UID: _4tr X,X} ( dX ud Xd Xs. 
X, 0 : Е 
0 : I(q—p) 
(7.5.1) 


Now using (A.3.19) make the transformation X,(7xn) = Tr x v) Ly(r x т) 


X,» p| Zu 2 L Kir 

subject to L4 L5 = I(r) and ES where L is a 
X,-24 q L Za Za Ldr 
n j n А 


n—r r 
; L 
completion of L, to make ~ f |. We have, by (A.6.6); 
\ І, 1 


OUS) — 
902,5) | Ler 


HX zT L)-rüq] 
ї=1 
We have also $X_3X_3' = TT'$. Hence from (7.5.1) we have for $Z$, $T$, $L_{3I}$ the
distribution
Din 3 —[D,yn-» :9] 


z[uem * ti ay | exp 4 —} tr Bere ; EU 10 ] 
0 : 0 il 


npa n 


Di : —[D,yn-v :0] 


21 A WA 5 А Zu "o: 
x [Zi : 41-2 tr fere : icr : ‘| [Zis : 7%] 
2 T : : 23 


21 H : 
0 : 1. 0 В І 


(LL) 


9(Lsp) 


ptr PP bat, dZa Ay 02 T QdTdL,| 
i=l 


Lor 


It is thus seen that $(Z_{11}, Z_{21})$, $(Z_{12}, Z_{22})$ and $(T, L_{3I})$ are distributed as three
independent sets. Therefore, integrating out over $(T, L_{3I})$ and $(Z_{12}, Z_{22})$ and noting
from (A.3.19) that the $c$'s of this section are exactly the same as $c[(Z_{11}Z_{11}')^{-1}(Z_{11}Z_{21}')(Z_{21}Z_{21}')^{-1}(Z_{21}Z_{11}')]$,
it becomes evident that, both on the null and on the non-null
hypothesis, the distribution of these $c$'s is exactly the same as that of the $c$'s in section (7.4)
with $n$ of that section being replaced here by $n - r$.


7.6. Distribution of characteristic roots connected with the multivariate analysis
of variance. Without any loss of generality we start from the canonical form
(A.7.5.6) and consider two cases separately, namely where (i) $p \le n_1$, and (ii) $p > n_1$,
involving respectively, a.e., $p$ non-zero and $n_1$ non-zero $c$'s.


(i) For case (i) use (A.3.8) to set $X_1(p\times n_1) = A(p\times p)\,D_{\sqrt{c}}(p\times p)\,L_1(p\times n_1)$
and $X_2(p\times n_2) = A(p\times p)\,L_2(p\times n_2)$, where $A$, $L_1$, $L_2$ satisfy the conditions of (A.3.8),
take over from (A.6.2.11) the Jacobian $J(X_1, X_2 : A, c\text{'s}, L_{1I}, L_{2I})$ and obtain for
$A$, $c$'s, $L_{1I}$ and $L_{2I}$ the distribution


pini +") k 4 
vum] ? ехр[—4{ҥАБы4'+У n-2 (AD clay] 


ny—p—1 


-Fn2—234 Tl 2 do. RIDES 
| x |a tha Й о de; moa "ft. (s оа, | 


a 


NI NI 
001,14) ат, 9015) 4 
90a) |, “| д(1зь) Ly, 

(7.6.1) 


oe 
Use (A.8.6.3) to integrate out over $L_{2I}$ and obtain for $A$, $c$'s and $L_{1I}$ the distribution


p(na +) к, 
21/27] ? Flp, na) exp [—1 { tr AD,,.4’+ ®у,—3 (404,23) 
= i-l 


m—p—l 


А |i 02—243 48 р: y Я а ai 280) bu _ 2 .6.2 
Pe е 2 [леа (в Maud | pr 
Lap) Lay 


This is the point to which, for the general case, the distribution problem can 
be conveniently reduced. If y;s = 0, i.e., all e(Z,Xs!'s = 0 (which by (A.1.13), 
happens if and only if X, = 0, i.e., £ = 0), then further reduction is possible and using 
(A.8.6.3) to integrate)out over Z4; we have for A and c's the distribution 


pin +») 
Pm] 3 Flp, m) F(p, m) exp |=} tr AD ar] aita 


sm p. 2-1 
x Me, 7 de mod | П (а). 233 (7.6.8) 
i<j=1 


Now as in (7.3), integrating out over A we have for сгв, i.e. for e(X, Xi (X_ X$)1)y's 
the following distribution on the null hypothesis £ — 0 : 


o E ee ee) 


i=1 


y зер! mna 2 
x | По ? dej(l+e) ? | mod |. Il (e—6)] <. (1.6.4) 
i=1 ic 


which is exactly the same form as (7.3.6). 


(ii) For ease (ii) use (A.3.14) to set X (p Xm) = p—" ia D, (т Хт) 


Us 
T4 


x L,(n,xn) and X4(pxn) = U(px p)L(pX ng), where U, L4, Da and c satisfy the 


nı 
conditions of (A.3.14), take over from (A.6.7.8) the Jacobian $J(X_1, X_2 : c\text{'s}, U, L_{1I}, L_{2I})$,
and obtain for $c$'s, $U$, $L_{1I}$, $L_{2I}$ the distribution


plns 3-n2) 
22[1/2л] ^? 


з 8 U, 
exp [73 [$e UD,.(p)U'+ $ y,-2 x Dor 
i=1 i=l U, 


ey 
S 
E 
— 


2-h «x А anc ies an 
x [Utm ^m (0,7 440 Il c, 2 de; mod [її (ee) 
i= i deje 


D i=1 


(eet TAN 
| 9a L^) (Ly Ly) 
x [anu] AL) MH | sn Жек], TOS 


Di 03] 
where Юр) = +10). 
0 02 p—n, 
Ny p—, 


Using (A.8.6.3) to integrate out over Lj, we have for c’s, U, L,, the 
distribution 


gm +») ^ ' U, 
21/27] 2 F(p, m) exp[—4{ tr UD, p)U 4 X 2X [| Joni] | 
i=1 F U, dá 


Mm y "n -1 , 

x|u|" +m—Pqyy g (б 71 7 П eP7 "1 140, mod [1 (2—91, O(L, Li) $ 
i=1 i=1 i<j=1 (Lip) Di 

(7.6.6) 


As in саве (i) we shall stop here so far as the non-central distribution is con- 
cerned. For the central саве, i.e., for the null hypothesis that y;’s=0, i.e., all (Z5 ')'s 
= 0 (which happens if and only if č = 0) further reduction is possible as in case 
(i) and, using (A.8.6.3) to integrate out over L,,, we have for U and c’s the distribution 


pini n3) 
[1/27] ? F(p, m) F(p, n) exp(—} tr РШ”) |U |"! *"*—?qU 


à : Sat. TI 
хп (Ü,g-"-'He ? de, mod [її (i-e) e (7.6.7) 
del il i<j=1 


p-n-1 
i 
Now putting $U D_{\sqrt{1+c}}(p) = V$ and using (A.8.8.2) we integrate out over $V$ and
obtain for $c$'s the distribution,


т"? i г аа yu г( eta 1—i )/ f fir ле r£ pit) ) 


i=) isl i=l 


1 - р—пу—1 Ny n2 
xi rmt ~a e ? defuere) ? ] moa [i i =]. .. (1.6.8) : 
isl i=1 i<j=1 
Another way to handle the distribution problem in this case is not to use
the transformation (A.3.14) and its Jacobian, but to use the transformation (A.3.15)
and its Jacobian given by (A.6.4). This gives us for $T$, $c$'s, $L_I$, $L_{1I}$ and $L_{2I}$ the
distribution
n 


Const exp|—} (е лр) BS yi E UI ls] | 


i-1 
pmi ni—l | 
х in motns 2—t qd Ti c; ? mod II (ае) 
Er ist i<j=1 
aly dL dL. 
` ЖЫ) ` [LU TERAN 
901) | in Lao) ри (3an) le 


It should be remembered that $L(n_1\times n_1)$ is orthogonal ($\perp$) while $L_1(n_1\times p)$ and $L_2(p\times n_2)$ (with
$n_1 < p \le n_2$) are semi-orthogonal. $T$ and $D_{\sqrt{c}}$ are of course $p\times p$ and $n_1\times n_1$
respectively. Integrating over $L_I$ and $L_{2I}$ by using (A.8.6), and absorbing the result in the
constant, we have for $c$'s the distribution
| ny P-n-l 
Const Пе; ? de;mod ae i —c}) 


i-1 i<j=l 


ny 


x f f =; (0-р) 2 уг y Btu Abs} | 


p ш m 
D nj4ng—i ~ dL. ^ 
x Hl ty ENTUM. 7.6.8.9, 
i 1.17) e 


19011) 


Since L,(n, хр) is semiorthogonal, let us adjoin to it M,(p—n, Хр) so as to make 


"ERE: 

, ie., [L  Mi]p 
M, |p—n. m p—m, 
p E 


orthogonal. ' 
Now taking into account (A.8.6.11) it is easy to check by putting
$[L_1' \;\; M_1'] = M(p \times p)$ that the integral occurring in (7.6.8.2) can be replaced by


m—r) (p-rXpr-1) 
2D d d 2 4 
Й (2 E | » 


pum 2 


ny i + 
ху У tye; oj) 
а 


i=l 


x | | exp] — à (tr ÕM D,,,M'T'—2 
7 


My. 
Р ; 4M 
SAL Pr ors rur RU LIMES at НВА SIT GENRES 
li ў, [MM] ' | ) 
| QUE 


where $D_{1+c}$ stands for a $p \times p$ diagonal matrix with diagonal elements $1+c_1, 1+c_2, \ldots,
1+c_{n_1}, 1, \ldots, 1$, and $m_{ij}$ are the elements of $M$. Using first the transformation $\tilde{T}M = U$,
then $U D_{\sqrt{1+c}} = V$, we observe that (7.6.8.3) now reduces to


1 m т-+»ӊ—р 
Const [ exp | —4 (tr VY -25 yV Daja +o] 17 dV. 
(7.6.8.4) 
Thus for the distribution of c's we have 
x n 2—m-—1 ni—l 
Const Ile; ? ас; mod П (e;—c;) s.) (7.6.8.5) 
i-1 i<j=1 
nı 
7 n 4-те р 
х f e| -tir Yr 2 У ADB aya] IVI” av, 
y ici Ы 


where the elements of $V$ vary from $-\infty$ to $\infty$. For the general, i.e., the non-null hypothesis
we would leave the distribution problem at this. On the null hypothesis, i.e.,
when the $\gamma$'s $= 0$, the integral becomes


[ e| 21 vv] ivi ay, 
Ys 


which we can evaluate by using (A.8.7) and which we can then absorb into the cons- 
tant, thus reducing (7.6.8.5) to 


p—m—1 mitna 


fT fs ny—1 | 
Const [He 2 dalte) 2 IL NN m- (7.6.8.6) 
The constant is easily seen to simplify into 


Aarts Er. n mime r m ae Eh 


i=l 


«(erint 


It may be noticed that (7.6.4), which is the distribution of the roots on the null hypothesis
in the case $p < n_1, n_2$, goes over into (7.6.8), which is the distribution of the roots
on the null hypothesis in the case $n_1 < p < n_2$, if we make the substitution
$(p, n_1, n_2) \to (n_1, p, n_2 - p + n_1)$. It can be proved by a general reasoning, without working
through the distributions, that this tie-up will be true both for the two null hypothesis
distributions as well as for the two non-null hypothesis distributions.
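
The substitution can be checked mechanically on the exponents of the null distributions. The sketch below is illustrative only. It assumes that both null distributions are written with $s$ roots and exponents $m_1$, $m_2$ as in the common standard form of section 7.8, taking $(s, m_1, m_2) = (p, (n_1-p-1)/2, (n_2-p-1)/2)$ in case (i) and $(n_1, (p-n_1-1)/2, (n_2-p-1)/2)$ in case (ii). These parameter values and all function names are assumptions made for the illustration.

```python
from fractions import Fraction


def params_case_i(p, n1, n2):
    """(s, m1, m2) assumed for case (i), where p does not exceed n1 or n2."""
    return p, Fraction(n1 - p - 1, 2), Fraction(n2 - p - 1, 2)


def params_case_ii(p, n1, n2):
    """(s, m1, m2) assumed for case (ii), where n1 < p and p does not exceed n2."""
    return n1, Fraction(p - n1 - 1, 2), Fraction(n2 - p - 1, 2)


def tie_up(p, n1, n2):
    """The substitution (p, n1, n2) -> (n1, p, n2 - p + n1) described in the text."""
    return n1, p, n2 - p + n1


# Hypothetical dimensions with n1 < p < n2.
p, n1, n2 = 5, 3, 12
via_substitution = params_case_i(*tie_up(p, n1, n2))
direct = params_case_ii(p, n1, n2)
print(via_substitution, direct, via_substitution == direct)
```

Under these assumptions the case (i) parameters evaluated at the substituted arguments coincide with the case (ii) parameters, which is the tie-up stated above at the level of the exponents.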


A special case of (ii), namely where $n_1 = 1$, is of considerable importance and
in this case not only the central but also the non-central distribution is easily available.
Notice that in this case $n_1 = 1$, $X_1(p \times n_1) = x(p \times 1)$ (say), $\xi(p \times n_1) = \xi(p \times 1)$
(say), and $L_1(1 \times 1)$, subject to $L_1 L_1' = I(1)$, is equal to $\pm 1$, so that $L_1$ drops
out and the associated Jacobian factor is equal to $1/2$, which we absorb in the constant. There is only one
non-zero (and positive) $c$, which is equal to $x'(X_2 X_2')^{-1}x$ [since, in general,
$\operatorname{tr} D_c = \sum_{i=1}^{n_1} c_i = \operatorname{tr}(X_1 X_1'(X_2 X_2')^{-1}) = \operatorname{tr} X_1'(X_2 X_2')^{-1}X_1$, and in this case
$\sum_{i=1}^{n_1} c_i = c$ and $\operatorname{tr} X_1'(X_2 X_2')^{-1}X_1 = x'(X_2 X_2')^{-1}x$] and also one possible non-zero (and positive)
$\gamma$, which is equal to $\xi'\Sigma^{-1}\xi$. Also both in (7.6.6) and (7.6.8) the factor
$\operatorname{mod}\left[\prod_{i<j=1}^{n_1}(c_i - c_j)\right]$ will drop out.


Substituting in (7.6.8) we have thus in this case, for $c$, the central distribution,
[7, 17],


Е н 


n. 
For the non-central distribution, i.e., when y 52 0, we have, in this case, 5 Ve= 7, 
1 


а A 
; i 3 5! z| Му; = x чи\/су and thus it is easily checked that by 
User eas 
substituting in (7.6.6) and remembering that $L_1$ could take just two values $\pm 1$, we
shall have, for $U$ and $c$, the joint distribution


Const exp [3 (tr UDj.dp)U'+7F2unVey|U |" T (D gt dU or Me 
i=1 


(7.6.10) 
or Const exp [—4 tr UD, (p)U'] cosh (uj усу) 10|"? 
r-i x a 
x I (Üj)g-1-4U e- de. vs (7.6.11) 
il 
p—1 [U 0; 1+0: 0 1 Vy ГА p 
Now putting UD т: = : ET 
; VEO Т, E pea Чер ОД 
eee AOTT 1 рі 


= V(say), we have for V and c the joint distribution 


Const exp (—} tr VV") cosh (v, 4/yc/1--c)] у |1? fi (Pae t-iay 
int 


(ns +1) 
x e-*de[(1--e) 2 . .. (7.0.12) 


To obtain the distribution of c by integrating over V, we use the same artifice as in 
(A.8.8), change over to a solid matrix W and obtain for c the distribution 


Const 222 mtl " 
Жр 1) e de|(1+-e) | | exp (—3 tr WW’) 
x cosh (way ye]1--c)| W|"**  —4W, — ... (7.6.13) 


where F(p—1, p—1) is given by (A.8.6.3). 


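To effect the integration over $W$, denote the row vectors of $W$ by $w_i$
($i = 1, 2, \ldots, p$) and make the transformation

$$\begin{bmatrix} w_2 \\ \vdots \\ w_p \end{bmatrix} = \begin{bmatrix} g_2 \\ \vdots \\ g_p \end{bmatrix} L(p \times p)$$

(where $L$ is an arbitrary $\perp$ matrix such that the first row vector of $L$ is along $w_1$).
Thus, though $L$ will depend on $w_1$, the Jacobian of this transformation is easily
checked to be unity. In this case the new matrix is

$$\begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1p} \\ g_{21} & g_{22} & \cdots & g_{2p} \\ \vdots & \vdots & & \vdots \\ g_{p1} & g_{p2} & \cdots & g_{pp} \end{bmatrix} \text{ (say)},$$

where we write $g^*$ for the column $(g_{21}, \ldots, g_{p1})'$ and $G$ for the remaining
$(p-1) \times (p-1)$ block of the $g$'s. Thus

$$WW' = \begin{bmatrix} w_1 w_1' & (w_1 w_1')^{1/2} g^{*\prime} \\ g^* (w_1 w_1')^{1/2} & g^* g^{*\prime} + GG' \end{bmatrix}.$$

It is easy to see that $|W| = |WW'|^{1/2} = |(w_1 w_1')\, GG'|^{1/2}$. Also $\operatorname{tr} WW'$ goes over
into $w_1 w_1' + \sum_{i=2}^{p} g_{i1}^2 + \operatorname{tr} GG'$.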
i-a 


Now by using (A.8.7.1) it is easy to integrate out 
^ ^q f a—p+1 
[ exp [= (3 + te 0601 П |61" P+? 49, 
i=? i= 
and obtain a constant which we absorb into the constant factor and obtain for c the 
distribution 


pu nol 
Const bs 2 de((l4-c) ? ] 


р ED ЗА 677 т. —р+1 
x | expl— bE wh] cosh (у?е +o 500) 
j= j= j 


wij = Ь% esp) e. (7.6.14) 


ашуу. 


pz 
To evaluate this integral we put $\sum_{j=1}^{p} w_{1j}^2 = r^2$ and $w_{11} = r\cos\theta$, so that, using (A.8.5.1),
we have the integral = constant (which is easily obtained)
œ m[2 n 
x ( f exp[— 172] cosh (r cos Ov/ye/(I-+e 7 dr (sin 0y-? 40. We 
7-0 6-0 
note that the integral over $\theta$ could be taken from $0$ to $\pi/2$ and the result multiplied
by 2.
Using now the formula (9) p. 79 and formula (2) p.393 of Watson's Bessel Func- 
tions and remembering the relation between Bessel J and Bessel J functions, this inte- 


Mtl .p. 


CELSUS iv ra) ‚ во that the distribution of c comes 


gral reduces to const EN 


out in the form 


r (ane 


r(Z jr (eee 


2 


p—2 Natl ; 
o (-2) [^ 9 ] n P tn erm 


(7.6.15) 


It is of interest to note that on the null hypothesis $\gamma = 0$, the confluent hypergeometric
function reduces to a constant and (7.6.15) goes over, as it should, into (7.6.9).

7.7. The distribution of characteristic roots connected with the multivariate
regression model of (4.25)-(4.33). Going back to (4.33) and noting the identity of this
distribution form with that of (4.21), it is easy to check from section (A.7.5) that the
distribution of the c[Z, Z1 (2, Zs)-!]'s (= сгв say) could not involve as parameters 
obtaining anything except cu P и") X-3ys = en U U'n')Z-!]s (= y;'8, say). The 
problem of the distribution of the $c[Z_1 Z_1'(Z_2 Z_2')^{-1}]$'s can thus be thrown back, when
$p \le q$, on the case (i) of (7.6) and, when $p > q$, on the case (ii) of (7.6), in both cases
putting $n_1 = q$ and $n_2 = n-1-q$. The complete reduction of the distribution
problem, i.e., the derivation of the joint distribution of the $c_i$'s on the null hypothesis
$\gamma_i = 0$ ($i = 1, \ldots, p$) (which, in this case, can happen if and only if $\eta = 0$, since $U$
is not presumably 0), can be effected in exactly the same manner as for the
distribution on the null hypothesis in the cases (i) and (ii) of section 7.6. Turning
now to $(Z_1 Z_1')(Z_2 Z_2')^{-1}$ and checking with (4.25)-(4.33), we see that $(Z_1 Z_1')(Z_2 Z_2')^{-1}
= [XU'(UU')^{-1}UX'] \times [XX' - n\bar{x}\bar{x}' - XU'(UU')^{-1}UX']^{-1}$.

An important special case is that of $p = 1$ which we can handle by putting,
in case (i) of (7.6), $p = 1$, $n_1 = q$ and $n_2 = n-q$, $L_1(p \times n_1) = l(1 \times q)$ and $A(1 \times 1)
= a$ (say), a scalar.

Substituting in (7.6.2) we have for $c$ (note that there is only one non-zero
$c$ here) the distribution


Ш 


n q—2 
201/27]? F(lon—q)e * dc 


x | fo eto det codeystavertyyea ft a0 — 5 wy. 
ЖЫЙ hn by i-a i-2 


(7.7.1) 
To evaluate the integral proceed as follows 
n/2 


| exp [Ea v/cyl] il dl,;/(1— X RES | cosh (a4/cy cos 0)(віп 9)? d0, 
1. ly ica i-a ae | 


so that substituting in (7.7.1) we have the integral under (7.7.1) reducing to 


= 7/2 
Const | | exp [—} a?(1+c)] cosh (асу cos 0)a"-!da(sin 0)'-*d0. 
4-0 0-0 


Taking account of the discussion after (7.6.14) this integral reduces to 


const F, (s g T He 1225) (1+e)"2, and hence we have for с the distribution 
2 2 p. с 


4—2 n 
qa oP ( = x) [c 3 de/(1+c)? JP, (14 уе), Э) 


T(n[2) 
* 1 pc 


T(g/2) T(n— 


the const factor being easily evaluated (since the constant factor at each stage is 
known and carried over to the next stage). 


If $\gamma = 0$, this reduces to


PICS. ees | 
ТОБ | Ree we (1.8) 


For e given by с = e/(1—e) we have on the non-null and the null hypothesis the 


respective distributions 


T(n/2): UU 
d . ee ioe (Le) e *. de. ЕТЮ), 
Dx T(g[2) T(n—q]2) 


7.8. Reduction of the various joint distributions of the characteristic roots to a 
common standard form. On the respective null hypotheses consider the distributions 




(7.3.5), (7.4.5), (7.5.4) and (7.5.8) and check that they can all be reduced to the 
following common standard form (expressed in terms of (7.4.5)). 


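$$\frac{1}{s!}\left[\pi^{s/2}\prod_{i=1}^{s}\frac{\Gamma\left(\frac{2m_1+2m_2+s+i+2}{2}\right)}{\Gamma\left(\frac{2m_1+i+1}{2}\right)\Gamma\left(\frac{2m_2+i+1}{2}\right)\Gamma\left(\frac{i}{2}\right)}\right]
\prod_{i=1}^{s} x_i^{m_1}(1-x_i)^{m_2} \times \operatorname{mod}\left[\prod_{i<j=1}^{s}(x_i-x_j)\right]\prod_{i=1}^{s} dx_i \qquad (7.8.1)$$

where $0 \le x_1, \ldots, x_s \le 1$ and, a.e., $0 < x_1, \ldots, x_s < 1$. If, however, we order the $x$'s
as $0 \le x_1 \le x_2 \le \ldots \le x_s \le 1$, or, a.e., $0 < x_1 < x_2 < \ldots < x_s < 1$, then the above
distribution can be rewritten as

$$\left[\pi^{s/2}\prod_{i=1}^{s}\frac{\Gamma\left(\frac{2m_1+2m_2+s+i+2}{2}\right)}{\Gamma\left(\frac{2m_1+i+1}{2}\right)\Gamma\left(\frac{2m_2+i+1}{2}\right)\Gamma\left(\frac{i}{2}\right)}\right]
\prod_{i=1}^{s} x_i^{m_1}(1-x_i)^{m_2} \prod_{i>j}(x_i-x_j)\prod_{i=1}^{s} dx_i. \qquad (7.8.2)$$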


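It is well known that $\prod_{i>j}(x_i - x_j)$ can be written in another form, namely, as a Vandermonde
determinant

$$\begin{vmatrix} x_s^{s-1} & x_{s-1}^{s-1} & \cdots & x_1^{s-1} \\ x_s^{s-2} & x_{s-1}^{s-2} & \cdots & x_1^{s-2} \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \cdots & 1 \end{vmatrix} \qquad (7.8.3)$$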
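
The identity between the product of differences and the Vandermonde determinant can be confirmed numerically. The following is a minimal sketch in exact rational arithmetic; it assumes the determinant is laid out with rows of descending powers $s-1, \ldots, 0$ and columns $x_s, \ldots, x_1$, and the helper names are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row with a non-zero pivot in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result


def vandermonde(xs):
    """Rows are powers s-1, ..., 0; columns are x_s, ..., x_1 (assumed layout)."""
    s = len(xs)
    cols = list(reversed(xs))
    return [[x ** power for x in cols] for power in range(s - 1, -1, -1)]


def product_of_differences(xs):
    """Product over i > j of (x_i - x_j)."""
    out = Fraction(1)
    for i in range(len(xs)):
        for j in range(i):
            out *= xs[i] - xs[j]
    return out


# Hypothetical ordered roots in (0, 1).
xs = [Fraction(1, 5), Fraction(1, 3), Fraction(1, 2), Fraction(7, 8)]
print(det(vandermonde(xs)) == product_of_differences(xs))
```

For ordered roots the product is positive, so no modulus is needed; for unordered roots only the absolute value of the determinant enters.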


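Denoting now the constant factor (within the square brackets) of (7.8.1) or
(7.8.2) by $k(s, m_1, m_2)$, we can rewrite (7.8.1) and (7.8.2) respectively as

$$\frac{k(s, m_1, m_2)}{s!} \prod_{i=1}^{s} x_i^{m_1}(1-x_i)^{m_2}\, dx_i \;\operatorname{mod}
\begin{vmatrix} x_s^{s-1} & \cdots & x_1^{s-1} \\ \vdots & & \vdots \\ x_s & \cdots & x_1 \\ 1 & \cdots & 1 \end{vmatrix} \qquad (7.8.4)$$

and

$$k(s, m_1, m_2) \prod_{i=1}^{s} x_i^{m_1}(1-x_i)^{m_2}\, dx_i
\begin{vmatrix} x_s^{s-1} & \cdots & x_1^{s-1} \\ \vdots & & \vdots \\ x_s & \cdots & x_1 \\ 1 & \cdots & 1 \end{vmatrix} \qquad (7.8.5)$$

or

$$k(s, m_1, m_2)
\begin{vmatrix} x_s^{m_1+s-1}(1-x_s)^{m_2} & \cdots & x_1^{m_1+s-1}(1-x_1)^{m_2} \\ \vdots & & \vdots \\ x_s^{m_1}(1-x_s)^{m_2} & \cdots & x_1^{m_1}(1-x_1)^{m_2} \end{vmatrix}
\prod_{i=1}^{s} dx_i. \qquad (7.8.6)$$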
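
As a numerical aid, the constant $k(s, m_1, m_2)$ can be evaluated directly from gamma functions. The sketch below assumes the bracketed constant has the form $\pi^{s/2}\prod_{i=1}^{s}\Gamma\left(\frac{2m_1+2m_2+s+i+2}{2}\right)\big/\left[\Gamma\left(\frac{2m_1+i+1}{2}\right)\Gamma\left(\frac{2m_2+i+1}{2}\right)\Gamma\left(\frac{i}{2}\right)\right]$, the normalising constant for the ordered roots; the function name `k_const` is hypothetical.

```python
import math


def k_const(s, m1, m2):
    """Assumed form of k(s, m1, m2): normalising constant for s ordered roots."""
    if s < 1 or m1 <= -1 or m2 <= -1:
        raise ValueError("need s >= 1 and m1, m2 > -1")
    out = math.pi ** (s / 2)
    for i in range(1, s + 1):
        out *= math.gamma((2 * m1 + 2 * m2 + s + i + 2) / 2)
        out /= (
            math.gamma((2 * m1 + i + 1) / 2)
            * math.gamma((2 * m2 + i + 1) / 2)
            * math.gamma(i / 2)
        )
    return out


def inverse_beta(a, b):
    """1 / B(a, b), the normalising constant of an ordinary beta density."""
    return math.gamma(a + b) / (math.gamma(a) * math.gamma(b))


# For a single root the form reduces to an ordinary beta density in x.
print(k_const(1, 2, 3), inverse_beta(3, 4))
# For s = 2 and m1 = m2 = 0 the ordered density is k (x2 - x1) on 0 < x1 < x2 < 1.
print(k_const(2, 0, 0))
```

With $s = 1$ the constant agrees with $1/B(m_1+1, m_2+1)$, and with $s = 2$, $m_1 = m_2 = 0$ it equals 6, which is what integrating $x_2 - x_1$ over $0 < x_1 < x_2 < 1$ requires.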


The following substitutions are to be made in (7.8.1) or (7.8.4) in order to obtain 
(7.3.6), (7.4.6), (7.5.4) and (7.5.8). For (7.3.6) put 5 = ¢;/(1+¢),8 = р, 


m = Ph Png = a ‚ for (7.4.6) put x; = 6, 8 = p. m, = pi 5 


UU xp ‚ for (7.6.4) put v; = с1(14-с,), s= p, m = =— Anc nora 


and for (7.6.8) put ж; = ¢%/(1+¢i), 8 = "4, i = a ЕЕ ‚m= шы > itis of 


interest to note that ordering the хв is exactly equivalent to ordering the сгв of (7.3.6), 
(7.6.4) and (7.6.8); in other words, 0 < 2 € ... STs <1 S04, <... 4 <0 and 
orb a ae ig хс er tse = <¢, <= о: 


CHAPTER EIGHT 
On the C.D.F. of the Largest and/or the Smallest Root 


In this chapter, starting from (7.8.2) or (7.8.5), we shall obtain the c.d.f. of $x_s$,
i.e., $P(x_s \le x)$, where $x$ is a given constant $\le 1$ (from which it is easy to check that
one can obtain the c.d.f. of $x_1$ by merely interchanging $m_1$ and $m_2$), and also obtain
$P(x_0 \le x_1 \le x_s \le x)$, where $x_0$ and $x$ are also given constants subject to $0 \le x_0 \le x \le 1$.
Starting from (7.8.5) and putting in (A.9.6.13) $m_i = m_1+i-1$ and $n = m_2$ ($i = 1, 2,
\ldots, s$) we have


PUO <2, <... < ty < x) = P(x, < х) = Мв, m, Ms) 
mQ-4-8—1,m, mó4-$—2, m, ... My, My 


хе j 5 MA s j Speers (SNL) 


m+s—1, Mm My+8—2, My .. My, My 
where f is to be successively and completely reduced with the help of the fundamental 
formula (A.9.6.13). 


For the c.d.f. of the smallest root $x_1$ we note that $P(x_1 \le x) = 1 - P(x_1 > x)
= 1 - P(x < x_1 \le \ldots \le x_s \le 1)$. Going back to the c.d.f. of $(x_1, \ldots, x_s)$ and using the
transformation $z_i = 1 - x_i$ ($i = 1, 2, \ldots, s$) we have
Tod 1 5 
8 8 
k(s, ту, ma) | f M | П a7" (1—2;)"* П (e—a), П do; 
i=l i>j=2 i 
€ oi 23-1 
1-= zh 51 g Pru T 
= h(s, тү, ms) | fef Hz"(1—2)^ П (e—a) I dz. v (82) 
A 0 i=1 i>j=2 i=1 


It is now easy to see that the integral on the right hand side of (8.2) is exactly the same
as that on the right hand side of (8.1) with just the interchange of $m_1$ and $m_2$, which
shows that the c.d.f. of the smallest root can be thrown back on that of the largest
root and vice versa. The final reduction of the exact c.d.f. of the largest or the smallest
root is necessarily lengthy and need not be given here. When $m_1 + m_2$ is large, which
is the case in most practical applications, there is a good approximation in relatively
far simpler terms (especially when percentage points, up to, say, 10% are needed)
which will be given in a later monograph.
In this chapter the final reduction for the exact c.d.f. of the largest root in
the case of $s = 2, 3, 4$ will be given. This is as follows:


For $s = 2$,

$$P(x_2 \le x) = \frac{k(2, m_1, m_2)}{m_1+m_2+2}\left[-\beta_0(x; m_1+1, m_2+1)\,\beta(x; m_1, m_2) + 2\beta(x; 2m_1+1, 2m_2+1)\right]. \qquad (8.3)$$
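
The case $s = 2$ is simple enough to evaluate directly. The sketch below is illustrative and rests on stated assumptions: $\beta_0(x; m, n)$ is taken to be $x^m(1-x)^n$, $\beta(x; m, n)$ to be $\int_0^x t^m(1-t)^n\,dt$, the constant $k(2, m_1, m_2)$ to have the gamma-function form of the standard form of section 7.8, and $m_1$, $m_2$ are restricted to non-negative integers so that the incomplete integral is a finite sum. The function names are hypothetical.

```python
import math


def k_const(s, m1, m2):
    """Assumed form of k(s, m1, m2) for the ordered roots."""
    out = math.pi ** (s / 2)
    for i in range(1, s + 1):
        out *= math.gamma((2 * m1 + 2 * m2 + s + i + 2) / 2)
        out /= (
            math.gamma((2 * m1 + i + 1) / 2)
            * math.gamma((2 * m2 + i + 1) / 2)
            * math.gamma(i / 2)
        )
    return out


def beta0(x, m, n):
    """Assumed meaning of beta_0(x; m, n): the integrand x^m (1 - x)^n."""
    return x ** m * (1 - x) ** n


def beta_inc(x, m, n):
    """Integral of t^m (1 - t)^n over [0, x] for non-negative integers m, n."""
    return sum(
        (-1) ** j * math.comb(n, j) * x ** (m + j + 1) / (m + j + 1)
        for j in range(n + 1)
    )


def largest_root_cdf_s2(x, m1, m2):
    """P(x_2 <= x) for s = 2, following the bracket of (8.3)."""
    bracket = (
        2 * beta_inc(x, 2 * m1 + 1, 2 * m2 + 1)
        - beta0(x, m1 + 1, m2 + 1) * beta_inc(x, m1, m2)
    )
    return k_const(2, m1, m2) / (m1 + m2 + 2) * bracket


def smallest_root_cdf_s2(x, m1, m2):
    """P(x_1 <= x): reflect x to 1 - x and interchange m1 and m2."""
    return 1 - largest_root_cdf_s2(1 - x, m2, m1)


print(largest_root_cdf_s2(0.5, 0, 0))
print(largest_root_cdf_s2(1.0, 2, 3))
print(smallest_root_cdf_s2(0.5, 0, 0))
```

For $m_1 = m_2 = 0$ the largest-root c.d.f. reduces to $x^3$, which gives a quick hand check, and the c.d.f. reaches 1 at $x = 1$ for any admissible $m_1$, $m_2$.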


For $s = 3$,


P(z Sx) = T ыа 2Bla; Эт,-Е3, 2mg4- 1) (ж; ту, m3) —2f(; 2m, +2, 2т4-1) 


lo (ат ЁЗ, ты +1) Pty «xh o e (84) 


XO que To) илт, 


For $s = 4$,
(4. m4, ms) 5 1 
Pi EPIRI Mt: x; m +3, ms Р(х < 2 
(<) ат) Ваа my +3, ma- Dia, my, ma) < 2) 
c 1 2 
-++2/(ж; 2mi4-5, ать) v. m m Ыы Оа WE B(x; 2m44-3, 2m3-- 1) 


{fale ту-Е2, ma--1) f(v; my+1, ma) -2(v; 2m, 4-3, 2mg4- » 


een 2m--4, 2m, 2) | 8; m44-2, ть4-1) (ж; m, ma) 
PP уа o ERA 2m +2, 2m,+1)}} we (85) 


Again starting from (7.8.5) and putting in (A.9.7.2) $m_i = m_1+i-1$ and
$n = m_2$ ($i = 1, 2, \ldots, s$) we have


Р(ж «v, X... €x, Xx) = P(e Kay « v, X v) 


m4,+s—1, mg ma44-85—2, m, ... Mi, My 

m+s—1, my т4-8—2, m, ... My, My 
= k(s, m4, m3) | T, 20; 

m+s—1, m, m4--8—2, my ... My, My 


(8.6) 


where $\beta$ is to be successively and completely reduced with the help of the fundamental
formula (A.9.7.2). Below is given the complete reduction of the left side of (8.6)
for $s = 2$, 3, and 4, the left side of (8.6) being conveniently denoted by $P_2$, $P_3$, $P_4$,
etc.

$$P_2 = \frac{k(2, m_1, m_2)}{m_1+m_2+2}\left[2\beta(x, x_0; 2m_1+1, 2m_2+1) - \beta(x, x_0; m_1, m_2)\left\{\beta_0(x; m_1+1, m_2+1)
+ \beta_0(x_0; m_1+1, m_2+1)\right\}\right]. \qquad (8.7)$$


mecnm) [2/(v, ay; mi, ma) (ж, 20; 2m44-3. 2m44-1) 


$7 (m4 ms4-3) 

— 2B(x, xy; my +1, ma) B(x, xy; 2m4--2, 2mg--1) 

yb hue (f (5 34-2, ma-- 1) — (y; m44-2, ma--1)7] .. (8.8) 
K(2, m4, т») А Or Mo H eR Caef É 


un 


k(4, m, ma) [2 
2f(z, xy; 2m4-2-5, 2m,+ 1)—__ 3 __ 
ST a Gens ae "ns; My; Mg) 


57 (yb ty +4) 


Ba те о, 
18, m, m) {Boas m +3, mat V) -- foto 931-3, Ma+1)} 


2f(x, xy; 2m44-3, 2m34--1 55 р 
P Ig) atl) {— polz; my 4-2, ma4- 1) B(x, а; m+ 1, mg) 


— fy; m 4-2, ma- 1) B(x, д; ma 4- 1, ть) 2006, жу; 2m, 4-3, 273-1) 


+ 


2/ff(v, Xo; 2т--4, 2m FIS оо ; у; 
Ke Tub ER P8) { Paz; m3 4-2, та A) (ae, ж; m3, ma) 


s 44-2 
— (жу; my - 2, mg4- V) Bla, а; m4, ma) d- 2v, а; 2m, +2, 2mg-- 1)4- а | 
(8.9) 


For larger values of $s$ the reduction of $P_s$ to exact expressions like those given by the
right side of (8.7)-(8.9) will no doubt be increasingly lengthy but the remarks made
after (8.2) will apply to this situation as well, so that if $m_1 + m_2$ is reasonably large, as
would be the case in most practical situations, we can use much shorter expressions
as good approximations.


CHAPTER NINE 


Operating Characteristics and Lower Bounds on the 
Power Functions of the Test Regions* 


9.1. The operating characteristics of the test regions. As of the moment the 
exact (small sample) power functions of the regions (6.4.2), (6.4.5), (6.4.8) and (6.4.11) 
seem to be, in the general cases, quite intractable. At any rate, so far as the author 
is aware, no method is known at the moment by which the requisite distribution 
problems could be solved and the final c.d.f.'s be given, except in very symbolic
(and, for practical purposes, quite useless) forms. However, it is possible even with- 
out exact expressions for c.d.f.'s, to obtain a number of useful semi-qualitative and 
semi-quantitative properties of the power functions, which, as will be presently seen, 
are about all that would really matter for most practical purposes. 


We observe from (A.7.1), (A.7.2), (A.7.5) and (A.7.3) respectively that the
powers of the critical regions (6.4.2), (6.4.5), (6.4.8) and (6.4.11) depend only on the
corresponding sets of population roots, namely the $c(\Sigma\Sigma_0^{-1})$'s (to be called $\gamma$'s) for the
first case, the $c(\Sigma_1\Sigma_2^{-1})$'s (to be called $\gamma$'s) for the second case, the $c(\Sigma^*\Sigma^{-1})$'s (to be called $\gamma$'s) for
the third case and the $c(\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}')$'s (to be called $\gamma$'s) for the fourth case. For
convenience we write down the respective powers for the four cases as

$$P[c_p \ge c_{2\alpha}(p, n) \text{ and/or } c_1 \le c_{1\alpha}(p, n) \mid H] = P(\alpha, p, n; \gamma_1, \gamma_2, \ldots, \gamma_p), \qquad (9.1.1)$$

$$P[c_p \ge c_{2\alpha}(p, n_1, n_2) \text{ and/or } c_1 \le c_{1\alpha}(p, n_1, n_2) \mid H] = P(\alpha, p, n_1, n_2; \gamma_1, \gamma_2, \ldots, \gamma_p), \qquad (9.1.2)$$

$$P[c_r \ge c_\alpha(p, k-1, n-k) \mid H] = P(\alpha, p, k-1, n-k; \gamma_1, \gamma_2, \ldots, \gamma_r) \qquad (9.1.3)$$

and

$$P[c_p \ge c_\alpha(p, q, n) \mid H] = P(\alpha, p, q, n; \gamma_1, \gamma_2, \ldots, \gamma_p). \qquad (9.1.4)$$

Notice that, depending on the rank of $\Sigma^*$ and $\Sigma_{12}$, some of the $\gamma$'s of (9.1.3) and (9.1.4)
might be zero but the most general case will be one in which as many as are set down
will be positive. Notice also that in (9.1.3), $r = \min(p, k-1)$. Recall now from
(A.3.3) that for (9.1.1), $0 <$ all $\gamma$'s $< \infty$, from (A.1.9) that for (9.1.2) and (9.1.3), $0 \le$ all
$\gamma$'s $< \infty$ and from (A.1.14) that for (9.1.4), $0 \le$ all $\gamma$'s $< 1$.


With this introduction we shall consider the power functions (9.1.1) and (9.1.2)
for the problems of one dispersion matrix and two dispersion matrices and the power
function (9.1.3) for the problem of multivariate analysis of variance and the power
function (9.1.4) for the problem of independence between two sets of variates. In
section 9.2, to each power function a lower bound will be obtained which will be called
a lower bound function and which will be seen to involve (aside from the degrees of
freedom) just those deviation parameters that occur in the power function itself. In
Chapter 10 it will be shown that for each power function the lower bound function
monotonically increases as each deviation parameter separately tends away from its

* See reference [43] in this connection. 
value under the null hypothesis. Although, under the null hypothesis, the lower
bound function does not assume the value $\alpha$ which is the significance level of the test,
this value is attained soon enough under deviations from the hypothesis. Thus the
power function stays greater than a monotonically increasing function of each deviation
parameter and is also shown to be unbiassed against all deviations from the hypothesis
for which the lower bound function is greater than or equal to the size $\alpha$
of the test. In chapter 11, for each of the power functions (9.1.3) and (9.1.4) another
such monotonic lower bound function is obtained which is believed to be closer than
the lower bound functions of section 9.2; also for each of the power functions (9.1.1)
and (9.1.2) some near monotonic properties are proved.

9.2. Lower bounds on the power functions. The lower bounds are obtained 
in three different stages to be called (9.2a), (9.2b) and (9.2c). 

9.2a. Reduction to canonical forms. Without any loss of generality we can,
for the case of (6.4.2), start right from the canonical form (A.7.1.1); for the case of
(6.4.5) from the canonical form (A.7.2.1); for the case of (6.4.8) from the canonical
form (A.7.5.6); and for the case of (6.4.11) from the canonical form (A.7.3.5). For
the case of (6.4.2) there is an additional point to be noted. Putting together (A.7.1)
and (6.4.1) it is clear that it will be appropriate, instead of using as we did in (A.7.1)
the transformation

$X(p \times n) = \mu(p \times p)\, Y(p \times n)$, where $\Sigma = \mu D_\gamma \mu'$
($\gamma$'s being the roots of $\Sigma$), to use the transformation $X(p \times n)
= A_0(p \times p)\,\mu(p \times p)\, Y(p \times n)$, where $\Sigma_0 = A_0 A_0'$ and $A_0^{-1}\Sigma(A_0')^{-1} = \mu D_\gamma \mu'$,

$\gamma$'s being the roots of $A_0^{-1}\Sigma(A_0')^{-1}$, i.e., of $\Sigma(A_0 A_0')^{-1}$, i.e., of $\Sigma\Sigma_0^{-1}$. Under this transformation
the $\gamma$'s of the canonical form (A.7.1.1) will really be the roots of $\Sigma\Sigma_0^{-1}$ and
the roots of $(YY')/n$ will really be the roots of the equation (6.4.1) and thus we have
an exact tie-up with the problem involving the power function of (6.4.2).

9.2b. The inclusion within the test regions (6.4.2), (6.4.5), (6.4.8) and (6.4.11)
of regions having simpler probability measures (under the respective non-null hypotheses).

(i) We recall from (6.4) and the canonical form (A.7.1.1) that the test region
(6.4.2) is really $\bigcup_a [a'YY'a / n a'a \ge c_{2\alpha}(p, n)$ or $\le c_{1\alpha}(p, n)]$, where $Y$ has the distribution
(A.7.1.1). We also notice from the canonical form (A.7.1.1) that the $p$ functions
$a_i'YY'a_i / \gamma_i a_i'a_i$ ($i = 1, 2, \ldots, p$) (with $a_i'$ being a $1 \times p$ row vector having 1 for
the $i$-th element and 0 for all other elements) are distributed as $p$ independent $\chi^2$'s
with d.f. $n$ each. Putting these two facts together we have that the test region
(6.4.2) includes the union of $p$ regions, each composed of the tail ends of a central
$\chi^2$-region, all the $p$ $\chi^2$'s being independent.

(ii) We recall from (6.4) and the canonical form (A.7.2.1) that the test region
(6.4.5) is really $\bigcup_a [n_2 a'Y_1Y_1'a / n_1 a'Y_2Y_2'a \ge c_{2\alpha}(p, n_1, n_2)$ or $\le c_{1\alpha}(p, n_1, n_2)]$, where
$Y_1$, $Y_2$ have the distribution (A.7.2.1). We also notice from the canonical form (A.7.3.1)
that the $p$ functions $n_2 a_i'Y_1Y_1'a_i / \gamma_i n_1 a_i'Y_2Y_2'a_i$ ($i = 1, 2, \ldots, p$) (with $a_i'$ being a $1 \times p$
row vector having 1 for the $i$-th element and 0 for all other elements) are distributed
as $p$ independent $F$'s with d.f. $n_1$ and $n_2$ each. Putting these facts together it is easy
to see that the test region (6.4.5) includes the union of $p$ regions, each composed of the
tail ends of a central $F$-region, all the $p$ $F$'s being independent.


(iii) From the canonical form (A.7.5.6) and from (6.4) we notice that the test
region (6.4.8) will really be $\bigcup_a [(n-k)a'Y^*Y^{*\prime}a / (k-1)a'YY'a \ge c_\alpha(p, k-1, n-k)]$
where $Y^*$ and $Y$ have the distribution (A.7.5.6). We also notice from the canonical
form (A.7.4.5) that the $p$ functions $(n-k)a_i'Y^*Y^{*\prime}a_i / (k-1)a_i'YY'a_i$ ($i = 1, 2, \ldots, p$)
(with $a_i'$ being a $1 \times p$ row vector having 1 for the $i$-th element and 0 for all other
elements) are distributed as $p$ independent $F$'s out of which at least $p-r$ are central
and at the most $r$ are non-central with non-centrality parameters $(\gamma_1, \ldots, \gamma_r)$ (notice
that if $s \le r = \min(p, k-1)$, then, out of these $\gamma$'s, $s$ will be positive and the rest,
i.e., $r-s$, will be zero). Putting these together we observe that the test region (6.4.8)
includes the union of $p$ regions, out of which at least $p-r$ are central $F$-regions and at
the most $r$ are non-central $F$-regions with non-centrality parameters $\gamma_i$'s ($i = 1, 2, \ldots, r$),
all $F$'s being independent and each being based on d.f. $k-1$ and $n-k$.


(iv) We notice from the canonical form (A.7.3.5) and from (6.4) that the test
region (6.4.11) will really be $\bigcup_{a,b} [(a'Y_1Y_2'b)^2 / (a'Y_1Y_1'a)(b'Y_2Y_2'b) \ge c_\alpha(p, q, n)]$, where
$Y_1$ and $Y_2$ have the distribution (A.7.3.5). We notice further that there are $p$ functions
$(a_i'Y_1Y_2'b_i)^2 / (a_i'Y_1Y_1'a_i)(b_i'Y_2Y_2'b_i)$ ($i = 1, 2, \ldots, p$) (with $a_i'(1 \times p)$ being a row
vector with 1 for the $i$-th element and 0 for all other elements and $b_i'(1 \times q)$ being a row
vector with 1 for the $i$-th element and 0 for all other elements) which are distributed
as the squares of $p$ independent correlation coefficients (some of them central and some
non-central, the respective non-centrality or deviation parameters being $\gamma_i$). Notice
that out of these $p$ $\gamma_i$'s, $t$ are positive and the rest, i.e., $p-t$, are zero, where $t \le p \le q$
is the rank of $\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'$, i.e., of $\Sigma_{12}$, and all lie between 0 and 1; the positive $\gamma$'s
can be conveniently arranged as $0 < \gamma_1 \le \ldots \le \gamma_t < 1$.

Putting these together it is easy to check that the test region (6.4.11) includes
the union of $p$ regions out of which $(p-t)$ are central correlation (square) regions and $t$
are non-central ones, all being independent and each based on d.f. $(n-1)$. When
$q > p$ it is possible to improve on this in the following manner. Pick out linked $a_i$
and $b_i$ ($i = 1, 2, \ldots, p-1$) and at the last stage an $a_p$, with a set of $b_i$'s ($i = p, p+1,
\ldots, q$) such that there are $p$ independently distributed correlation squares, of which
$(p-1)$ are total correlation (squares) and the last one is a multiple correlation (square)
between the $p$-th variate of the $Y_1$-set and the $(p, p+1, \ldots, q)$ variates of the $Y_2$-set.
The deviation parameters being $\gamma_i$'s ($0 < \gamma_1 \le \ldots \le \gamma_t < 1$), we could so arrange that the
first $p-t$ sample (total) correlation (squares) had zero deviation parameters to go with,
the next $t-1$ sample (total) correlation (squares) had respective (one each) deviation
parameters $(\gamma_1, \gamma_2, \ldots, \gamma_{t-1})$ to go with, and the last sample (multiple) correlation
(square) had $\gamma_t$ to go with. Notice that the distributions of the square of a
correlation (central and non-central) are easily available from those of the central
and non-central multiple correlation coefficient (see (7.4.9) and (7.4.20)) by putting
9.2c. Actual construction of the lower bounds. From (9.2b) it is now easy to
write down the lower bounds of the power functions (9.1.1)-(9.1.4) of (6.4.2), (6.4.5),
(6.4.8) and (6.4.11) as follows:

$$P(\alpha, p, n; \gamma_1, \gamma_2, \ldots, \gamma_p) > 1 - \prod_{i=1}^{p}\left[1 - P\left(\chi^2 \ge c_{2\alpha}(p, n)/\gamma_i \text{ or } \le c_{1\alpha}(p, n)/\gamma_i\right)\right] \qquad (9.2.1)$$

(each $\chi^2$ being based on d.f. $n$),

$$P(\alpha, p, n_1, n_2; \gamma_1, \gamma_2, \ldots, \gamma_p) > 1 - \prod_{i=1}^{p}\left[1 - P\left(F \ge c_{2\alpha}(p, n_1, n_2)/\gamma_i \text{ or } \le c_{1\alpha}(p, n_1, n_2)/\gamma_i\right)\right] \qquad (9.2.2)$$

(each $F$ being based on d.f. $n_1$ and $n_2$),

$$P(\alpha, p, k-1, n-k; \gamma_1, \gamma_2, \ldots, \gamma_s) > 1 - \left[1 - P(\text{central } F \ge c_\alpha(p, k-1, n-k))\right]^{p-s}
\times \prod_{i=1}^{s}\left[1 - P(\text{non-central } F \ge c_\alpha(p, k-1, n-k) \mid \gamma_i)\right] \qquad (9.2.3)$$

(each $F$ being based on d.f. $k-1$ and $n-k$), and finally

$$P(\alpha, p, q, n; \gamma_1, \gamma_2, \ldots, \gamma_t) > 1 - \left[1 - P(r^2 \ge c_\alpha(p, q, n) \mid \text{null hypothesis})\right]^{p-t}
\times \prod_{i=1}^{t}\left[1 - P(r^2 \ge c_\alpha(p, q, n) \mid \rho^2 = \gamma_i)\right] \qquad (9.2.4)$$

(each $r^2$ being based on d.f. $(n-1)$).

If $p < q$, it is easy to check from (9.2b) that this lower bound could be
improved by the following

$$P(\alpha, p, q, n; \gamma_1, \ldots, \gamma_t) > 1 - \left[1 - P(r^2 \ge c_\alpha(p, q, n) \mid \text{null hypothesis})\right]^{p-t}
\times \prod_{i=1}^{t-1}\left[1 - P(r^2 \ge c_\alpha(p, q, n) \mid \rho^2 = \gamma_i)\right]\left[1 - P(R^2 \ge c_\alpha(p, q, n) \mid \rho^2 = \gamma_t)\right] \qquad (9.2.5)$$

where all the factors except the last are on squares of (total) correlations distributed
with d.f. $n-1$, while the last is on the square of a multiple correlation, distributed with
$n-1$ and $q-p$ d.f., and also where, out of the $p-1$ total correlations, $p-t$ are central,
$t-1$ are non-central with non-centrality parameters $\gamma_1, \ldots, \gamma_{t-1}$ and the multiple
correlation is non-central with the non-centrality parameter $\gamma_t$.

To compute in any situation the right sides of (9.2.1)-(9.2.5) we observe that
aside from the central, i.e., ordinary, $\chi^2$ and $F$ and the total correlation (squared) distributions
(the last one being transformable to an $F$-distribution), which are all well
known and have their percentage points tabulated, we need, in addition, tables for the
c.d.f. of non-central $F$ and non-central multiple correlation (connected with the multivariate
normal population). These tables are available in part [11, 22, ‚ 50] and could
be easily extended with modern computing facilities.
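
Once the component tail probabilities have been read from tables, every one of these bounds is the same elementary computation: one minus a product of complements. The sketch below is a minimal illustration, assuming the individual probabilities are supplied by the user; the function names and the numerical inputs are hypothetical and are not taken from any table.

```python
import math


def lower_bound(component_probs):
    """1 - prod(1 - p_i) for independent component regions with probabilities p_i."""
    product = 1.0
    for prob in component_probs:
        if not 0.0 <= prob <= 1.0:
            raise ValueError("each component probability must lie in [0, 1]")
        product *= 1.0 - prob
    return 1.0 - product


def lower_bound_manova(p_central, p_noncentral, p_dim):
    """Bound of the form (9.2.3): p_dim - s central factors and s non-central ones.

    p_central    : P(central F >= c_alpha), taken from tables (assumed input)
    p_noncentral : list of P(non-central F >= c_alpha | gamma_i), one per positive gamma_i
    p_dim        : the number p of variates
    """
    s = len(p_noncentral)
    if s > p_dim:
        raise ValueError("cannot have more non-central factors than variates")
    return lower_bound([p_central] * (p_dim - s) + list(p_noncentral))


# Hypothetical inputs: p = 4 variates, two positive deviation parameters.
bound = lower_bound_manova(p_central=0.01, p_noncentral=[0.30, 0.55], p_dim=4)
print(round(bound, 6))
```

Since each factor $1 - p_i$ lies between 0 and 1, the bound is never smaller than the largest single component probability, and it increases whenever any one component probability increases.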
It may be noted that if in (9.2.3) we put $k = 2$, i.e., $s = 0$ or 1, then each
side of (9.2.3) is computationally accessible, the left side being the power function
of Hotelling's $T^2$, while the right side is also easily available (in this as in all other
cases).

It is of considerable importance at this stage to ask how "good" the lower
bounds indicated in (9.2.1), (9.2.2) and (9.2.3) or (9.2.4) are. A lower bound to the power
could be said to be "good" if it were (i) close to the actual power, and/or (ii) if it were
itself pretty large, being greater than the level of significance $\alpha$ for reasonably large
values of the deviation parameters and possibly getting larger as those parameters
increase. For all the three tests condition (ii) has been numerically checked to be
true over a fairly wide range of test values of the several parameters involved. With
regard to condition (i), in general, that is, for small samples, not only do we not know
the actual power (in which case the search for a lower bound would have been redundant)
but at the moment we do not even know an upper bound on the expression:
(actual power $-$ given lower bound to it) $\div$ actual power. In large samples, however,
the situation improves and it turns out that the relative error is "small", so that
the given lower bounds are "good" also in the sense (i).


CHAPTER TEN 


The Monotonic Character of the Lower Bounds 


on the Power Functions 


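10.1. The problem of one dispersion matrix. For convenience we rewrite
(9.2.1) as

$$P(\alpha, p, n; \gamma_1, \ldots, \gamma_p) > 1 - \prod_{i=1}^{p} P\left(\frac{c_{1\alpha}(p, n)}{\gamma_i} \le \chi^2 \le \frac{c_{2\alpha}(p, n)}{\gamma_i}\right), \qquad (10.1.1)$$

each $\chi^2$ being based on d.f. $n$.

Now denoting, for shortness, the factors in the product on the right side of
(10.1.1) by $P_1, P_2, \ldots, P_p$, we shall show that $\partial P_i/\partial(1/\gamma_i)$ is positive or negative according
as $\gamma_i$ is $>$ or $< 1$, or in other words, $P_i$ decreases as $\gamma_i$ tends away from 1, provided
that $c_{1\alpha}$ and $c_{2\alpha}$ are so chosen that

$$\left[\frac{\partial P_i}{\partial(1/\gamma_i)}\right]_{\gamma_i = 1} = 0.$$

Proof: Aside from a constant and positive factor of proportionality which is
free from $\gamma_i$, we have

$$P_i = \int_{c_{1\alpha}/\gamma_i}^{c_{2\alpha}/\gamma_i} e^{-\chi^2/2}(\chi^2)^{\frac{n-2}{2}}\, d(\chi^2), \qquad (10.1.2)$$

and thus

$$\frac{\partial P_i}{\partial(1/\gamma_i)} = c_{2\alpha}\, e^{-c_{2\alpha}/2\gamma_i}\left(\frac{c_{2\alpha}}{\gamma_i}\right)^{\frac{n-2}{2}} - c_{1\alpha}\, e^{-c_{1\alpha}/2\gamma_i}\left(\frac{c_{1\alpha}}{\gamma_i}\right)^{\frac{n-2}{2}}, \qquad (10.1.3)$$

where $c_{2\alpha}^{n/2}\, e^{-c_{2\alpha}/2} - c_{1\alpha}^{n/2}\, e^{-c_{1\alpha}/2} = 0$.

It is easy to check from (10.1.3) that $\partial P_i/\partial(1/\gamma_i)$ is positive if $\gamma_i > 1$ and negative if
$\gamma_i < 1$ and also that $P_i \to 0$ as $\gamma_i \to \infty$ or 0.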




Thus the right side of (10.1.1) monotonically increases as each $\gamma_i$, separately,
tends away from unity and the left side, which is the power function of the test,
always stays greater than this monotonic function. Furthermore although at $H_0$, i.e.,
when all $\gamma$'s $= 1$, this monotonic function is $< \alpha$, it becomes greater than or equal to
$\alpha$ for all $\gamma$'s satisfying

$$\prod_{i=1}^{p} P\left(\frac{c_{1\alpha}}{\gamma_i} \le \chi^2 \le \frac{c_{2\alpha}}{\gamma_i}\right) \le 1 - \alpha. \qquad (10.1.4)$$

This means that the test itself is unbiased at least against all alternatives $\gamma$'s
satisfying (10.1.4).
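
The behaviour of a single factor $P_i$ can be examined numerically. The sketch below is an illustration under stated assumptions: the limits are taken in the form $c_{1\alpha}/\gamma_i$ and $c_{2\alpha}/\gamma_i$ with any normalising factor absorbed into the $c$'s, the degrees of freedom $n$ are even so that the $\chi^2$ tail is a finite sum, and $c_{2\alpha}$ is obtained from a chosen $c_{1\alpha}$ through the condition $c_{2\alpha}^{n/2}e^{-c_{2\alpha}/2} = c_{1\alpha}^{n/2}e^{-c_{1\alpha}/2}$. The function names and numerical values are hypothetical.

```python
import math


def chi2_sf_even(x, n):
    """P(chi-square with even d.f. n exceeds x), via the finite Poisson sum."""
    if n <= 0 or n % 2:
        raise ValueError("this closed form needs positive even d.f.")
    half = x / 2.0
    term, total = 1.0, 0.0
    for j in range(n // 2):
        if j > 0:
            term *= half / j  # builds (x/2)^j / j!
        total += term
    return math.exp(-half) * total


def acceptance_prob(gamma, c1, c2, n):
    """The factor P_i: P(c1/gamma <= chi-square_n <= c2/gamma)."""
    return chi2_sf_even(c1 / gamma, n) - chi2_sf_even(c2 / gamma, n)


def matching_upper_limit(c1, n):
    """Solve c2 > n with c2^(n/2) exp(-c2/2) = c1^(n/2) exp(-c1/2) by bisection."""
    if not 0.0 < c1 < n:
        raise ValueError("need 0 < c1 < n")

    def g(c):
        return (n / 2.0) * math.log(c) - c / 2.0  # log of c^(n/2) exp(-c/2)

    target = g(c1)
    lo, hi = float(n), float(n) + 1.0
    while g(hi) > target:  # g decreases beyond its maximum at c = n
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if g(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


n, c1 = 4, 1.0
c2 = matching_upper_limit(c1, n)
for gamma in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(gamma, round(acceptance_prob(gamma, c1, c2, n), 4))
```

With the limits matched in this way the printed factor is largest at $\gamma = 1$ and falls off on both sides, so that $1 - \prod P_i$ rises as any $\gamma_i$ moves away from unity.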


10.2. The problem of two dispersion matrices. As in the previous case we
rewrite (9.2.2) as

$$P(\alpha, p, n_1, n_2; \gamma_1, \gamma_2, \ldots, \gamma_p) > 1 - \prod_{i=1}^{p} P\left(\frac{c_{1\alpha}(p, n_1, n_2)}{\gamma_i} \le F \le \frac{c_{2\alpha}(p, n_1, n_2)}{\gamma_i}\right), \qquad (10.2.1)$$

each $F$ being based on d.f. $n_1$ and $n_2$. Now denoting, for shortness, the factors in
the product on the right side of (10.2.1) by $P_1, P_2, \ldots, P_p$ we can show exactly as in the
previous case that $\partial P_i/\partial(1/\gamma_i)$ is positive or negative according as $\gamma_i >$ or $< 1$, provided
that $c_{1\alpha}$ and $c_{2\alpha}$ are so chosen that $\left[\partial P_i/\partial(1/\gamma_i)\right]_{\gamma_i = 1} = 0$. It is also easy to check that
$P_i \to 0$ as $\gamma_i \to \infty$ or $\to 0$. Thus, as before, the right side of (10.2.1) monotonically
increases to 1 as each $\gamma_i$, separately, tends away from unity and the expression to
the left side of (10.2.1) which is the power function of the test always stays greater
than this monotonic function. As in the previous case it follows here also that the test
itself is unbiassed at least against all alternative $\gamma$'s satisfying

$$\prod_{i=1}^{p} P\left(\frac{c_{1\alpha}}{\gamma_i} \le F \le \frac{c_{2\alpha}}{\gamma_i}\right) \le 1 - \alpha. \qquad (10.2.2)$$


10.3. The problem of multivariate analysis of variance. We rewrite (9.2.3)
as

$$P(\alpha, p, k-1, n-k; \gamma_1, \gamma_2, \ldots, \gamma_s) > 1 - \left[P(\text{central } F \le c_\alpha(p, k-1, n-k))\right]^{p-s}
\times \prod_{i=1}^{s} P(\text{non-central } F \le c_\alpha(p, k-1, n-k) \mid \gamma_i), \qquad (10.3.1)$$

each $F$ being based on d.f. $k-1$ and $n-k$ and $s = \min(p, k-1)$. It is well known that
$P(\text{non-central } F \le c_\alpha \mid \gamma)$ is a monotonically decreasing function of $|\sqrt{\gamma}|$, which has
been and can be proved in various ways, perhaps the simplest proof being the following.
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It is well known that with a canonical p.d.f.. we can, except for a constant and posi- 
tive factor of proportionality not involving y, write 


ki nok k-i nak 
P (non-central F —c, |y) | exp { i 7а + У s) П dz; П dy; (10.3.2) 
D i=1 D i 


ї=1 


where the domain of integration D is 
ERAI EE ME nok 
(ау) X а Sca У ys 
i-2 i=l 


and w,’s and y;'s are otherwise capable of going from —oo toco. We can thus rewrite 
the right side of (10.3.2) as 


x m = 3+ 

П dz, II dy, exp | (x à? + x 92£)] — 3 ode, V, ...(10.3.2 
sic geet an oxo [aiara М) | eset date}, oso 
yili = 1, ..,n—k) ТЕМ, 


n-k all 
where f? = с, X y?— X wi 
iei i 


i=2 


and only the positive square root of f? is 


supposed to be taken and the ws and y;s vary from —oo to œ subject 


oP 
= Idx; Пау; 
Е) | па, Паи 


ехр [-+(5 a+ "3 vt) Her (- } (МУ? ) —exp ( E Ay 40) 


Remembering that f is. a.e., positive it is easily seen that accordingas 4/ is positive 


to f? always staying non-negative. Thus we have 


or negative (4/y-+-f)? is. a.e., > ог < (—/y--f)*. so that exp (avr) is, a.€., 


< or > exp үт (— Vth); so that QP/O(4/y) is negative or positive. This means 
that P is a monotonically decreasing function of |4/7| and it is easy to check that in 
this case it 0 as | 4/y | — co. 
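
The monotonicity just proved can be checked numerically. The following is a minimal sketch, not part of the original text: for a fixed positive $f$ it evaluates the normalised inner integral $(2\pi)^{-1/2}\int_{-\sqrt{\gamma}-f}^{-\sqrt{\gamma}+f}\exp(-\tfrac12 x_1^2)\,dx_1$ at several values of $\sqrt{\gamma}$. The function name `inner_probability` and the value chosen for `f` are illustrative assumptions.

```python
import math


def standard_normal_cdf(t):
    """Cumulative distribution function of N(0, 1)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def inner_probability(root_gamma, f):
    """Normalised integral of exp(-x^2/2) over [-root_gamma - f, -root_gamma + f].

    This is the inner integral over x_1 in the rewritten form of the
    non-central probability, for a fixed value f > 0 (hypothetical helper).
    """
    return standard_normal_cdf(-root_gamma + f) - standard_normal_cdf(-root_gamma - f)


f = 1.5  # assumed illustrative value of the positive root f
values = [inner_probability(g, f) for g in (0.0, 0.5, 1.0, 2.0, 4.0)]
for g, v in zip((0.0, 0.5, 1.0, 2.0, 4.0), values):
    print(f"sqrt(gamma) = {g:3.1f}  inner probability = {v:.4f}")
```

For any fixed $f > 0$ the computed values fall as $|\sqrt{\gamma}|$ grows and are unchanged when the sign of $\sqrt{\gamma}$ is reversed, in line with the sign of $\partial P/\partial(\sqrt{\gamma})$ obtained in the proof.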

Thus the right side of (10.3.1) is a monotonically increasing function of each $|\sqrt{\gamma_i}|$ separately, tending to unity as each $|\sqrt{\gamma_i}| \to \infty$, and the left side of (10.3.1), which is the power function of the test, stays greater than this monotonic function. As in the previous cases the test is unbiassed at least against all alternatives satisfying

$$[P(\text{central } F \le c_\alpha)]^{p-s} \prod_{i=1}^{s} P(\text{non-central } F \le c_\alpha \mid \gamma_i) < 1-\alpha. \tag{10.3.4}$$


10.4. The problem of test of independence between two sets of variates. We rewrite (9.2.5) as

$$P(\alpha, p, q, n;\ \gamma_1, \gamma_2, \ldots, \gamma_p) > 1 - \prod_{i=1}^{p-1} P(\text{non-central } r^2 \le c_\alpha(p, q, n) \mid \gamma_i) \times P(\text{non-central } R^2 \le c_\alpha(p, q, n) \mid \gamma_p), \tag{10.4.1}$$

where all $r^2$'s are based on d.f. $n-1$ and $R^2$ is the square of a multiple correlation based on d.f. $n-1$ and $q-p$ and where $\gamma_p$ is the largest population canonical correlation coefficient. Notice that for a particular alternative some of the $\gamma$'s might be zero and in any case the $\gamma$'s vary from 0 to 1. As in the previous case it is well known and can be proved in various ways that both $P(\text{non-central } r^2 \le c_\alpha \mid \gamma)$ and $P(\text{non-central } R^2 \le c_\alpha \mid \gamma)$ are each a monotonically decreasing function of $|\sqrt{\gamma}|$, which $\to 0$ as $|\sqrt{\gamma}| \to 1$. The simplest proof of this theorem can be developed exactly on the same lines as in the previous case. But this need not be spelled out here.

Thus the right side of (10.4.1) is a monotonically increasing function of each $|\sqrt{\gamma_i}|$ separately, tending to unity as each $|\sqrt{\gamma_i}| \to 1$, and the left side of (10.4.1), which is the power function of the test, stays greater than this monotonic function. As in the previous case, this test is unbiassed at least against all alternatives satisfying

$$\prod_{i=1}^{p-1} P(\text{non-central } r^2 \le c_\alpha \mid \gamma_i) \times P(\text{non-central } R^2 \le c_\alpha \mid \gamma_p) < 1-\alpha. \tag{10.4.2}$$

CHAPTER ELEVEN 


Other Monotonic Lower Bounds on the Power Functions*


11.1. Multivariate analysis of variance test. We start from the canonical form (A.7.5.5) and denote by $c_r$ the largest characteristic root of $(Y_1Y_1')(Y_2Y_2')^{-1}$, by $H_0$ the $H(\gamma_i = 0)$ $(i = 1, 2, \ldots, s)$ and by $H$ its complement, and observe that for a given positive $c_0$, $P(c_r \le c_0 \mid H)$ = a function of $\gamma_1, \gamma_2, \ldots, \gamma_s$ = $\psi_\alpha(\gamma_1, \gamma_2, \ldots, \gamma_s)$, say. We shall prove that

$$P(c_r \le c_0 \mid H),\ \text{i.e.,}\ \psi_\alpha(\gamma_1, \gamma_2, \ldots, \gamma_s) \tag{11.1.1}$$

stays less than a monotonically decreasing function of each $|\sqrt{\gamma_i}|$ separately (notice that each $\gamma_i \ge 0$), which is different from the decreasing function on the right side of (10.3.1).


Proof: We recall from (A.2.2) that the largest characteristic root $c_r$ of $(Y_1Y_1')(Y_2Y_2')^{-1}$ can be written as $\sup_a (a'Y_1Y_1'a)/(a'Y_2Y_2'a)$ and the domain $c_r \le c_0$ can be rewritten as

$$\sup_a\,(a'Y_1Y_1'a)/(a'Y_2Y_2'a) \le c_0,\ \text{or}\ \bigcap_a\,[(a'Y_1Y_1'a)/(a'Y_2Y_2'a) \le c_0]. \tag{11.1.2}$$

It is easy to see now that the canonical p.d.f. based on (A.7.5.5) can be rewritten as

$$\text{Const}\ \exp\left[-\tfrac12\left(\sum_{i=1}^{p}\sum_{j=1}^{n_1} x_{ij}^2 + \sum_{i=1}^{p}\sum_{j=1}^{n_2} y_{ij}^2\right)\right] \tag{11.1.3}$$

and region (11.1.2) can be rewritten as

$$\sup_a\,[a'(X+\theta)(X'+\theta')a/a'YY'a] \le c_0,\ \text{or}\ \bigcap_a\,[a'(X+\theta)(X'+\theta')a/a'YY'a \le c_0], \tag{11.1.4}$$

where $\theta(p \times n_1)$ is such that $\theta_{ij} = \sqrt{\gamma_i}$ $(i = j = 1, 2, \ldots, s)$ and $= 0$ otherwise, and where

$$Y_1(p \times n_1) = X(p \times n_1) + \theta(p \times n_1)\ \text{and}\ Y_2(p \times n_2) = Y(p \times n_2). \tag{11.1.5}$$

Notice that all the components of $X$ and $Y$ will vary from $-\infty$ to $\infty$. Notice also that $r = \min(p, n_1)$, and $s$, i.e., the number of non-zero population roots might go up to $r$. Observe further that the constant factor in (11.1.3) does not, in this case, involve the $\gamma_i$'s. The problem is now one of integrating (11.1.3) over the domain (11.1.4) (which let us call $\omega_\alpha$, for shortness) and showing that the integral stays less than a monotonically decreasing function of each $\gamma_i$ or $\theta_{ii}$ separately. It will


* See reference [35] in this connection. 
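suffice to show the monotonic character of this integral with respect to variation of, say, $|\sqrt{\gamma_1}|$. To this end, remembering that $a'$ is a non-null row vector $(a_1, a_2, \ldots, a_p)$ we might, without any loss of generality, put $a_1 = 1$ and rewrite (11.1.4) as

$$\bigcap_a \left[\sum_{j=1}^{n_1}\left(\sum_{i=1}^{p} a_i(x_{ij} + \theta_{ij})\right)^2 \le c_0 \sum_{j=1}^{n_2}\left(\sum_{i=1}^{p} a_i y_{ij}\right)^2\right], \tag{11.1.6}$$

where $\theta_{ij} = \sqrt{\gamma_i}$ (if $i = j = 1, 2, \ldots, s$) and $= 0$ otherwise, and where $a_1 = 1$. To carry out the integration of (11.1.3) over (11.1.6), we first integrate out over $x_{11}$ and then check the total integral, which we call $I_\alpha$, is proportional to

$$I_\alpha = \int \left[\int_{\sup_a l_{2a}}^{\inf_a l_{1a}} \exp\left(-\tfrac12 x_{11}^2\right) dx_{11}\right] \exp\left[-\tfrac12\left(\sum x_{ij}^2 + \sum z_i^2 + \sum y_{ij}^2\right)\right] \prod dx_{ij} \prod dz_i \prod dy_{ij}, \tag{11.1.7}$$

the symbols being defined in the following way. For $y_{ij}$'s, as in (11.1.3), $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, n_2$ but for $z_i$'s (where $z_i = x_{i1}$) $i = 2, 3, \ldots, p$. Also for $x_{ij}$'s, $i = 1, 2, \ldots, p$ and $j = 2, 3, \ldots, n_1$, and

$$l_{1a} = [-\sqrt{\gamma_1} - f_{1a} + f_{2a}]\ \text{and}\ l_{2a} = [-\sqrt{\gamma_1} - f_{1a} - f_{2a}], \tag{11.1.8}$$

where

$$f_{1a} = \sum_{i=2}^{p} a_i z_i\ \text{and}\ f_{2a} = \left[c_0 \sum_{j=1}^{n_2}\left(\sum_{i=1}^{p} a_i y_{ij}\right)^2 - \sum_{j=2}^{n_1}\left(\sum_{i=1}^{p} a_i(x_{ij} + \theta_{ij})\right)^2\right]^{1/2}. \tag{11.1.9}$$

Furthermore, (i) the constant of proportionality in (11.1.7) is free from $\sqrt{\gamma_1}$, (ii) $z_2, \ldots, z_p$ vary from $-\infty$ to $\infty$ while $y_{ij}$'s and $x_{ij}$'s from $-\infty$ to $\infty$ subject to $f_{2a}$ always staying real, (iii) for $f_{2a}$ only the positive square root is to be taken, (iv) $f_{1a}$ and $f_{2a}$ are free from $\sqrt{\gamma_1}$. Now with $a_1 = 1$, let $a^*$ denote the value of $a$ for which $f_{2a}$ is a minimum. Then it is clear that this $a^*$ is free from $\gamma_1$ and $z_i$'s and is a function of $x_{ij}$'s, $y_{ij}$'s, $c_0$ and possibly also of $\theta_{ij}$'s. Notice that $a_1^* = 1$. Also let $l_{1a^*}$ and $l_{2a^*}$ stand for the values of $l_{1a}$ and $l_{2a}$ on substitution of $a^*$ for $a$. It is now clear that $\inf_a l_{1a} \le l_{1a^*}$, $\sup_a l_{2a} \ge l_{2a^*}$, so that Interval $[\sup_a l_{2a}, \inf_a l_{1a}] \subset$ Interval $[l_{2a^*}, l_{1a^*}]$.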
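Let us now introduce an $I_\alpha^*$ such that, aside from a constant and positive factor of proportionality (the same as for $I_\alpha$) it is defined by

$$I_\alpha^* = \int \left[\int_{l_{2a^*}}^{l_{1a^*}} \exp\left(-\tfrac12 x_{11}^2\right) dx_{11}\right] \exp\left\{-\tfrac12\left(\sum x_{ij}^2 + \sum z_i^2 + \sum y_{ij}^2\right)\right\} \prod dx_{ij} \prod dz_i \prod dy_{ij}. \tag{11.1.10}$$

It will be seen that, while $I_\alpha$ is the integral of the, a.e., positive function (11.1.3) over the domain (11.1.6) which is the intersection of a class of domains, $I_\alpha^*$ is the integral of the same, a.e., positive function (11.1.3) over the intersection of a subclass of the previous class. In fact, the subclass is formed by excluding from (11.1.6) all $a$'s for which $\inf_a l_{1a} \le l_{1a} < l_{1a^*}$ and/or $l_{2a^*} < l_{2a} \le \sup_a l_{2a}$. This shows that $I_\alpha \le I_\alpha^*$.

It is now easy to check that, aside from a constant and positive factor of proportionality, we have

$$\frac{\partial I_\alpha^*}{\partial(\sqrt{\gamma_1})} = \int \left[\exp\left(-\tfrac12 l_{2a^*}^2\right) - \exp\left(-\tfrac12 l_{1a^*}^2\right)\right] \exp\left[-\tfrac12\left(\sum x_{ij}^2 + \sum z_i^2 + \sum y_{ij}^2\right)\right] \prod dx_{ij} \prod dz_i \prod dy_{ij}$$

$$= \int \left[\exp\left\{-\tfrac12(\sqrt{\gamma_1} + f_{1a^*} + f_{2a^*})^2\right\} - \exp\left\{-\tfrac12(\sqrt{\gamma_1} + f_{1a^*} - f_{2a^*})^2\right\}\right] \times \exp\left[-\tfrac12\left(\sum x_{ij}^2 + \sum z_i^2 + \sum y_{ij}^2\right)\right] \prod dx_{ij} \prod dz_i \prod dy_{ij}, \tag{11.1.11}$$

by using (11.1.9). The domain of variation of $z_i$'s, $x_{ij}$'s and $y_{ij}$'s has been already defined immediately after (11.1.9). It will be proved that the expression on the right side of (11.1.11) is negative for positive values of $\sqrt{\gamma_1}$ and positive for negative values of $\sqrt{\gamma_1}$, or, in other words, $I_\alpha^*$ is a monotonically decreasing function of $|\sqrt{\gamma_1}|$. To prove this we proceed as follows.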

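We recall from the remarks following (11.1.9) that $f_{2a^*}$ is a function of $x_{ij}$'s, $y_{ij}$'s, $c_0$ and possibly also of the other $\theta$'s, while $f_{1a^*}$ is just a linear function of $z_i$'s with a coefficient vector $a^*$ which is a function of the same quantities that occur in $f_{2a^*}$. Thus, since $z_i$'s are each a $N(0, 1)$, therefore, the conditional distribution of $f_{1a^*}$, given $a^*$, that is, given $x_{ij}$'s and $y_{ij}$'s, is normal with zero mean and variance $\sigma_{a^*}^2 = \sum_{i=2}^{p} a_i^{*2}$. Therefore, aside from a constant and positive factor of proportionality, we can rewrite (11.1.11) as

$$\int \left[\exp\left\{-\tfrac12(\sqrt{\gamma_1} + f_{1a^*} + f_{2a^*})^2\right\} - \exp\left\{-\tfrac12(\sqrt{\gamma_1} + f_{1a^*} - f_{2a^*})^2\right\}\right] \times \frac{1}{\sigma_{a^*}} \exp\left(-\frac{f_{1a^*}^2}{2\sigma_{a^*}^2}\right) df_{1a^*}\ \exp\left[-\tfrac12\left(\sum x_{ij}^2 + \sum y_{ij}^2\right)\right] \prod dx_{ij} \prod dy_{ij}. \tag{11.1.12}$$

Integrating out over $f_{1a^*}$ it is easy to check that the right side reduces to

$$\int \left[\exp\left\{-\frac{(\sqrt{\gamma_1} + f_{2a^*})^2}{2(1+\sigma_{a^*}^2)}\right\} - \exp\left\{-\frac{(\sqrt{\gamma_1} - f_{2a^*})^2}{2(1+\sigma_{a^*}^2)}\right\}\right] \times (1+\sigma_{a^*}^2)^{-1/2} \exp\left[-\tfrac12\left(\sum x_{ij}^2 + \sum y_{ij}^2\right)\right] \prod dx_{ij} \prod dy_{ij}. \tag{11.1.13}$$

Remembering that $f_{2a^*}$ is, a.e., positive, it is now easy to check that, according as $\sqrt{\gamma_1}$ is positive or negative, we have a.e., $(\sqrt{\gamma_1} + f_{2a^*})^2 >$ or $< (\sqrt{\gamma_1} - f_{2a^*})^2$; that is, a.e.,

$$\exp\left\{-\frac{(\sqrt{\gamma_1} + f_{2a^*})^2}{2(1+\sigma_{a^*}^2)}\right\} <\ \text{or}\ > \exp\left\{-\frac{(\sqrt{\gamma_1} - f_{2a^*})^2}{2(1+\sigma_{a^*}^2)}\right\}. \tag{11.1.14}$$

Thus, the integral (11.1.13) is negative or positive according as $\sqrt{\gamma_1}$ is positive or negative, which proves that $I_\alpha^*$ is a monotonically decreasing function of each $|\sqrt{\gamma_i}|$ separately, so that the power of the test stays greater than a monotonically increasing function of each $|\sqrt{\gamma_i}|$ separately and is unbiassed at least against all alternatives $\gamma_i$'s for which $I_\alpha^* \le 1-\alpha$.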
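
The step of integrating out the normal variate $f_{1a^*}$ in the proof above rests on a standard Gaussian integral. The following worked identity is a sketch added for convenience, not part of the original text; $f$ stands for the linear function $f_{1a^*}$, $\sigma^2$ for its conditional variance, and $c$ for either $\sqrt{\gamma_1} + f_{2a^*}$ or $\sqrt{\gamma_1} - f_{2a^*}$ (these shorthand symbols are introduced here only for the illustration).

```latex
% Completing the square in f, with a = 1 + 1/\sigma^2:
%   (c+f)^2 + f^2/\sigma^2 = a\,(f + c/a)^2 + c^2/(1+\sigma^2).
\[
\frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty}
  \exp\Bigl\{-\tfrac12 (c+f)^2\Bigr\}\,
  \exp\Bigl\{-\frac{f^2}{2\sigma^2}\Bigr\}\,df
= (1+\sigma^2)^{-1/2}\,
  \exp\Bigl\{-\frac{c^2}{2(1+\sigma^2)}\Bigr\}.
\]
```

Applying this once with $c = \sqrt{\gamma_1} + f_{2a^*}$ and once with $c = \sqrt{\gamma_1} - f_{2a^*}$ gives the two exponentials whose difference decides the sign of the derivative; the common factor $(1+\sigma^2)^{-1/2}$ is positive and so does not affect that sign.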

11.2. Test of independence between two sets of variates. We start from the canonical form (A.7.3.5), denote by $c_p$ the largest characteristic root of $(Y_1Y_1')^{-1}(Y_1Y_2')(Y_2Y_2')^{-1}(Y_2Y_1')$, by $H_0$ the $H(\gamma_i = 0)$ $(i = 1, 2, \ldots, p)$ and by $H$ its complement and then observe that, for a given $c_0 (< 1)$, $P(c_p \le c_0 \mid H)$ = a function of $\gamma_1, \ldots, \gamma_p$ = $\psi_\alpha(\gamma_1, \ldots, \gamma_p)$, say. We shall prove that

$$P(c_p \le c_0 \mid H),\ \text{i.e.,}\ \psi_\alpha(\gamma_1, \ldots, \gamma_p) \tag{11.2.1}$$

stays less than a monotonically decreasing function of each $|\sqrt{\gamma_i}|$ separately (notice that each $\gamma_i \ge 0$ and $< 1$), which is different from the decreasing function on the right side of (10.4.1).

Proof: It will suffice to prove this monotonicity with respect to any one parameter, say $\gamma_1$. Toward this end we proceed as follows. We first rewrite the canonical p.d.f. based on (A.7.3.5) in the expanded form


-y ÉL. Z ehti ntaa t E E 11.2.2 
Const exp [ dE p 2 (xis -- Vi — 2V tu Va) Jm n vit | e (11.2.2) 
by putting $Y_1 = X$ and $Y_2 = Y$ (the elements of the latter matrices being $x_{ik}$ and $y_{jk}$) and letting all new variates also vary from $-\infty$ to $\infty$. We next use (A.3.9) to find a triangular $\tilde U(q \times q)$ such that

$$YY' = \tilde U \tilde U'\ \text{and}\ u_{ij} = 0\ \text{if}\ j > i\ (j = 2, 3, \ldots, q). \tag{11.2.3}$$

We recall from (A.3.9) that given $Y$, the elements of $\tilde U$ can be uniquely determined by adopting a convention, say, that $u_{ii} > 0$ $(i = 1, 2, \ldots, q)$, provided that $Y$ is of rank $q$, as it will, almost everywhere, be. Now (see (A.3.15)) it is possible to choose an orthogonal transformation: $X(p \times n)\,\Gamma(n \times n) = X^*$ and $Y = Y$ (notice that although $\Gamma$ might involve $Y$, yet the Jacobian is 1), such that

$$XX' = X^*X^{*\prime},\ YY' = YY'\ \text{and}\ XY' = X^* \begin{bmatrix} I(q) \\ 0((n-q) \times q) \end{bmatrix} \tilde U'. \tag{11.2.4}$$

This is easily seen if we put $Y(q \times n) = \tilde U(q \times q) L(q \times n)$ (where $LL' = I(q)$), complete $L(q \times n)$ into an orthogonal matrix $\begin{bmatrix} L \\ L_1 \end{bmatrix} = \Gamma'$, say, and then put $X = X^*\Gamma'$; so that $XX' = X^*X^{*\prime}$ and $XY' = X^* \begin{bmatrix} L \\ L_1 \end{bmatrix} L' \tilde U' = X^* \begin{bmatrix} I(q) \\ 0 \end{bmatrix} \tilde U'$. The p.d.f. of $X^*$, $Y$ can now be conveniently written as


n p 
Const exp |—3 ® X hpa) Sub], ve (11.2.5) 
- k=l i=1 i2jel 
where pj, = 0 if k>i and = \/y,ifk <i. We now put
(ат — pata) (1 —p3)* = 2% КАДА Ө б) 


(being the elements of a matrix 2(рхл)) and Jg = р/(1—рф)%. and obtain the p.d.f. 
of z,, and уд (which vary from —co to oo) in the form 


Const exp [3 (Ў Ў at X i$]. AREZ 
kel i=l i>j=1 


where w,’s are given in terms of y,,’s by (11.2.3). Notice from (11.2.2), (11.2.4) and 
(11.2.6) that finally 


(А 5 - 9 
(MY Ww mE (2. + Pacta) Cint Punta Y — Pie) — од), s. (11.2.8) 
j n 1 min(,j^) 
(ҮҮ); = @+-бил)(1—рф rug and (Y, Yi), — E aues e (11.2.8) 
= T3 


(íi =1,2....,p3 j, j! = 1,2,....q). Next we recall from (A.2.3) that the largest 
characteristic root c, of (Y, ¥,)(¥,¥2)(Y2 Уз) (Y, Y1) сап be written as 


Sup ,,(a/ Y, Y?b)?/(a'Y,Yja)b'Y,Y;b) and the domain c, < e, as 
Sup , (a! Y, Y5b)*/(aY, Y1a)(b' Y, Y5b) < c a CED D) 
or Marla’ Y, Yjb)/(a'Y, Ү;а)(Ь У, Y;b) < со] 
or alternatively as 
Cala! Y, Y2b)/((a'Y, Уа) У, Ysb)—(a’ Y, Ygb)"} < e (say)] (11.2.10) 


or, using (11.2.8), as 


ФА УКЫ! Му} a 
am РА A 2, а (Za амо) уи) < Co x (2, bjuy)?)j 


Equi X ( X a(zg--fana))]- expression on the left of < in (11.2.11)}. 
ist ded а) 
Now, by using certain standard inequalities, and taking the intersection over $b$, it is easy to check that (11.2.11) reduces to


Ma [ X (2 а, (Zt Pita) : «6, b (É [S j |; sen (1052512) 


k=1 kei 


in which, without any loss of generality, we can take $a_1 = 1$. The problem now is one of integrating out (11.2.7) over (11.2.12), the $u_{ij}$'s being given by (11.2.3). To carry out the integration of (11.2.7) over (11.2.12) we proceed exactly as in the previous case, integrate out over $z_{11}$ and then check that, aside from a constant and positive factor of proportionality, the total integral which we call $I_\alpha$ is given by [see (11.1.7)]


таб, 
I, = ( | | exp (— biden | exp ETE] Паг ид. (1,9118) 
Sup ls, 


In (11.2.13), (i) гуу is omitted from zs, (ii) us (for i 52 k) and дв vary from —00 
to co and u;'s from 0 to co, all subject to la and l, staying real and (iii) la and ly, 
are given by 


hs [уу — 7») а А thon) and la, —[— МУЛУ) f s (11.2.14) 
in which fia and f,, are defined by 


Dp 
fa= 2, 42+ Pata) 


2nd „= 5 (5 җы 05 Gut Pate) — (01.215) 
К=й+1 i=l k=2 i=l 

Arguing now exactly in the same manner as in subsection (11.1) we can establish that $I_\alpha$ stays less than a monotonically decreasing function $I_\alpha^*$ and thus the power of the test stays greater than a monotonically increasing function of $|\sqrt{\gamma_1/(1-\gamma_1)}|$; that is, of $|\sqrt{\gamma_1}|$; that is, of any $|\sqrt{\gamma_i}|$, from considerations of symmetry. Also the test is unbiassed against at least all alternatives $\gamma_i$'s for which $I_\alpha^* < 1-\alpha$.


There are reasons to believe that the lower bounds indicated in sections 11.1 
and 11.2 are closer than the lower bounds for the corresponding problems, indicated 
in sections 10.3 and 10.4. 


11.3. Test of independence between two sets of variates under the regression model of (4.25)-(4.33). It will be observed from section (7.7) that the distribution of the roots (and therefore that of the largest root) in this case can be identified with that of case (i) of (7.6) when $p \le q$ and with that of case (ii) of (7.6) when $p > q$, in both cases, by putting $n_1 = q$ and $n_2 = n$. It will be also observed that this identification holds for the distributions on both the null and the non-null hypothesis. We have, therefore, exactly the same kind of monotonicity property in this situation as in the case (11.2) and no separate proof need, therefore, be given for this case.


11.4. Modified test for the equality of two dispersion matrices against a special class of alternatives. We take over from (6.4.5) the acceptance region for the hypothesis $H_0 : \Sigma_1 = \Sigma_2$ and rewrite it as

$$c_{1\alpha}(p, n_1, n_2) \le \text{all } c\text{'s} \le c_{2\alpha}(p, n_1, n_2), \tag{11.4.1}$$

where $c_{1\alpha}$ and $c_{2\alpha}$ are so chosen as to satisfy

$$P(c_{1\alpha} \le \text{all } c\text{'s} \le c_{2\alpha} \mid \Sigma_1 = \Sigma_2) = 1-\alpha \tag{11.4.2}$$

and

$$\left[\frac{\partial P(c_{1\alpha} \le \text{all } c\text{'s} \le c_{2\alpha} \mid \Sigma_1 \ne \Sigma_2)}{\partial \gamma_i}\right]_{\gamma_1 = \ldots = \gamma_p = 1} = 0\ (i = 1, 2, \ldots, p),\ \text{or}\ \left[\frac{\partial P(c_{1\alpha}, c_{2\alpha};\ \gamma_1, \ldots, \gamma_p)}{\partial \gamma_i}\right]_{\gamma_1 = \ldots = \gamma_p = 1} = 0\ (i = 1, 2, \ldots, p), \tag{11.4.3}$$

remembering that if $\Sigma_1 \ne \Sigma_2$ the probability $P$ is, aside from the degrees of freedom $n_1$ and $n_2$ and the limits $c_{1\alpha}$ and $c_{2\alpha}$, purely a function of $\gamma_i$'s, the characteristic roots of $\Sigma_1\Sigma_2^{-1}$. It will be shown here that if $\gamma_1 = \ldots = \gamma_p = \gamma$ (say) (which means that $\Sigma_1\Sigma_2^{-1}$ itself is equal to $\gamma I(p)$, i.e., $\Sigma_1 = \gamma\Sigma_2$), then $P$ monotonically decreases, i.e., the power of the test monotonically increases as this common $\gamma$ tends away from 1 which is the value of $\gamma$ on the null hypothesis $\Sigma_1 = \Sigma_2$.


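Proof: We start from the canonical probability

$$\text{Const} \left[\prod_{i=1}^{p} \gamma_i\right]^{-n_1/2} \exp\left[-\tfrac12 \operatorname{tr}(D_{1/\gamma} X_1X_1' + X_2X_2')\right] dX_1\, dX_2,$$

where the constant factor is a pure constant not involving the parameters, $D_{1/\gamma}$ stands for a diagonal matrix whose diagonal elements are $1/\gamma_1, \ldots, 1/\gamma_p$, $X_1$ and $X_2$ are $p \times n_1$ and $p \times n_2$ $(p \le n_1, n_2)$ and where the $c$'s of (11.4.1) are the roots of $X_1X_1'(X_2X_2')^{-1}$. We first show that the $p$ equations under (11.4.3) are really equivalent to one equation. To prove this we note that aside from the constant factor,

$$P = \left[\prod_{i=1}^{p} \gamma_i\right]^{-n_1/2} \int_{c_{1\alpha} \le \text{all } c[X_1X_1'(X_2X_2')^{-1}] \le c_{2\alpha}} \exp\left[-\tfrac12 \operatorname{tr}(D_{1/\gamma} X_1X_1' + X_2X_2')\right] dX_1\, dX_2. \tag{11.4.4}$$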
Hence 
oP 2 tra 
um ham * 2» 30v, XX] 
e) ora < all ra XIXA] € Coa 
x exp[—Àtr (Din; X, X1 -X,X5)MX, dX. 4.5) 


Now using the transformation

$$X_1(p \times n_1) = U(p \times p)\, D_{\sqrt{c}}(p \times p)\, L_1(p \times n_1)\ \text{and}\ X_2(p \times n_2) = U(p \times p)\, L_2(p \times n_2), \tag{11.4.6}$$

where $U$ is non-singular and $L_1L_1' = L_2L_2' = I(p)$, and integrating out over $L_1$ and $L_2$, we observe that, aside from a positive and constant factor of proportionality, (11.4.5) reduces to


oP 


nı 
arp = | Aa [BI Dy] o (042) 


m—2 


x` exp[—$ tr (Dim; UD», VERE "Р av fi Cae de, П (6—0), 


where $D_c$ stands for a diagonal matrix with diagonal elements $c_1, \ldots, c_p$ and the domain $D$ is

$$c_{1\alpha} \le \text{all } c\text{'s} \le c_{2\alpha}\ \text{and}\ -\infty < \text{all } u\text{'s} < \infty. \tag{11.4.8}$$

We have thus
oP _ = nı HUD, 0"), | exp [—} tr (UD, U'+UU' 
{ад h= ШЕ KUD, U^] exp [—4 tr (UD,0'+ UU 
ny -p-l 
x jute du fice *® de Ho (e—. (14) 
ja DEED 


Having regard to the definition of the domain given by (11.4.8) and the structure of the integral on the right side of (11.4.9) it is easy to check that this integral is invariant under a change of the subscript $i$, so that the expression on the left side of (11.4.9) is the same for $i = 1, 2, \ldots, p$ and hence the $p$ equations (11.4.3) are equivalent to really one equation. Now adding $p$ formally different looking integrals like the right side of (11.4.9) over $i = 1, 2, \ldots, p$ and cancelling a factor we see that (11.4.3) is really equivalent to


+ = 
[[175* оро exp [—} tr (UD; uuu Ul aU 
D 


y ъ—р—1 
x По * de, П (4—6) = 0. 2. (11.4.10) 
i> 


i=l 


It is easy to check that the left side of (11.4.10) is the same as if we had put all $\gamma_i$'s $= \gamma$ in (11.4.4) and then differentiated the integral with respect to $\gamma$ and then put $\gamma = 1$. As will be presently seen this will enable us to rewrite (11.4.10) in a simpler form (which can also be derived in a straightforward though rather lengthier manner). At this point, remembering the definition of $D$ by (11.4.8), we merely observe that (11.4.10) gives one relation between $c_{1\alpha}$ and $c_{2\alpha}$, which we call the condition of local unbiassedness and then (11.4.2) added to this determines $c_{1\alpha}$ and $c_{2\alpha}$ completely.

Going back now to the problem of proving the monotonicity of the integral on the right of (11.4.4) under the special assumption that $\gamma_1 = \ldots = \gamma_p = \gamma$ (say), we proceed as follows.


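Putting $D_{1/\sqrt{\gamma}} X_1 = Y_1$ and $X_2 = Y_2$ we note that (11.4.4) reduces to

$$\int_{c_{1\alpha} \le \text{all } c[D_{\sqrt{\gamma}} Y_1Y_1' D_{\sqrt{\gamma}} (Y_2Y_2')^{-1}] \le c_{2\alpha}} \exp\left[-\tfrac12 \operatorname{tr}(Y_1Y_1' + Y_2Y_2')\right] dY_1\, dY_2. \tag{11.4.11}$$

Now putting all $\gamma_i = \gamma$ it is easy to check that this reduces to

$$\int_{c_{1\alpha}/\gamma \le \text{all } c[Y_1Y_1'(Y_2Y_2')^{-1}] \le c_{2\alpha}/\gamma} \exp\left[-\tfrac12 \operatorname{tr}(Y_1Y_1' + Y_2Y_2')\right] dY_1\, dY_2. \tag{11.4.12}$$

We are thus back on the problem of the distribution of the characteristic roots on the null hypothesis and we have, therefore, aside from a constant and positive factor of proportionality not involving the $\gamma$,

$$P = \int_{c_{1\alpha}/\gamma \le c_1 \le \ldots \le c_p \le c_{2\alpha}/\gamma} \prod_{i=1}^{p} c_i^{(n_1-p-1)/2} (1+c_i)^{-(n_1+n_2)/2} \prod_{i>j} (c_i - c_j) \prod_{i=1}^{p} dc_i = \int_{c_{1\alpha}/\gamma}^{c_{2\alpha}/\gamma} \int_{c_{1\alpha}/\gamma}^{c_p} \cdots \int_{c_{1\alpha}/\gamma}^{c_2} f(c_1, \ldots, c_p) \prod_{i=1}^{p} dc_i, \tag{11.4.13}$$

where

$$f(c_1, \ldots, c_p) = \prod_{i=1}^{p} c_i^{(n_1-p-1)/2} (1+c_i)^{-(n_1+n_2)/2} \prod_{i>j} (c_i - c_j) = \prod_{i=1}^{p} c_i^{m} (1+c_i)^{-n} \prod_{i>j} (c_i - c_j)\ \text{(say)}. \tag{11.4.14}$$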
It is now easy to check that 
Сза[у ©р-1 es У 
OPE Caa Ста di 
am e ong gt fem) H ds 
Cial Y. CialY cial Y 
озү ор сз A A 
е 1 i 
E pace sfe: ee Ti de, (11.4.15) 
CialY Vial Y ial Y 
Csal Y e$ ue -2 —1 
_ On Coal у)" т ж SE Jh. cS fegato 
EA ULT | Lis | д SEO eur Ae "s (9) = p a} 


hi а" а} азу Уй 


€2a] Y es У 
Lh [RC 2 I | it a+)” de; in Gen) ü nw 
{1 n i Eu i-2 i»j-2 i-a n 
y tal Y ла 
= ky) hos), вау. 
The condition of local unbiassedness is that
E40) = E141). s. (11.4.16) 


We shall now show that subject to (11.4.16) the last expression on the right of (11.4.15) $> 0$ if $\gamma > 1$ and $< 0$ if $\gamma < 1$. The proof will thus be complete if we can show that according as $\gamma > 1$ or $< 1$ we have


C A n 
(+9) nn (в) ie, > (еи BO op оа А0) 
(1 2E Cza)" IXy) бф , ; Ten 11) li ЕСИ Ly 

24 

(11.4.17) 
Now according as y > 1 or < 1 we have 
Cia Coa А 
( + eie 2) > or < (14-619) /( 1+ Ca). ss {IBIS 


Thus (11.4.17) will be proved if we show that J,(y) is an increasing function of y and 
I,(y) a decreasing function of y. 


Now 
чш (ла)? Tana CLE Oa) =: CoalC2a Y)” j ss (11.4.19) 
(1/у) (1 E Cia д(1/у) (1 at. C20 
ne ~ 
where Z stands for the positive quantity 
CaalY Cpi es = E —1 
| T ete” de, "Ho (e—e) "П («- E 
t=2 i> j=2 i=2 y 
Cial Y CialY Cial Y 
x T (s = c) (%® — 53. s (01.4.20) 
ie \Y Y 7 
It is thus easy to see that 0) < 0 and D < 0, so that P of (11.4.13) mono- 


tonically decreases or the power of the test (6.4.5) monotonically increases as y tends 
away from 1. 
11.5. Modified test of the hypothesis that a population dispersion matrix has a given (matrix) value against a special class of alternatives. We take over from (6.4.4) the acceptance region for the hypothesis $H_0 : \Sigma = \Sigma_0$ and rewrite it as

$$c_{1\alpha}(p, n) \le \text{all } c\text{'s} \le c_{2\alpha}(p, n), \tag{11.5.1}$$

where $c_{1\alpha}$ and $c_{2\alpha}$ are chosen so as to satisfy

$$P(c_{1\alpha} \le \text{all } c\text{'s} \le c_{2\alpha} \mid \Sigma = \Sigma_0) = 1-\alpha \tag{11.5.2}$$

and

$$\left[\frac{\partial P(c_{1\alpha} \le \text{all } c\text{'s} \le c_{2\alpha} \mid \Sigma \ne \Sigma_0)}{\partial \gamma_i}\right]_{\gamma_1 = \ldots = \gamma_p = 1} = 0\ (i = 1, 2, \ldots, p),\ \text{or}\ \left[\frac{\partial P(c_{1\alpha}, c_{2\alpha};\ \gamma_1, \ldots, \gamma_p)}{\partial \gamma_i}\right]_{\gamma_1 = \ldots = \gamma_p = 1} = 0\ (i = 1, 2, \ldots, p). \tag{11.5.3}$$

Here the $c$'s are the characteristic roots of $\frac{1}{n}(XX')\Sigma_0^{-1}$, $\gamma_i$'s are the characteristic roots of $\Sigma\Sigma_0^{-1}$ and $X(p \times n)$ $(p \le n)$ is the reduced observation matrix. Exactly along the same lines as in the previous case it can be proved that (i) the $p$ equations (11.5.3) are really equivalent to one equation and that (ii) if $\gamma_1 = \gamma_2 = \ldots = \gamma_p = \gamma$ (say), in other words if $\Sigma\Sigma_0^{-1} = \gamma I(p)$, i.e., $\Sigma = \gamma\Sigma_0$, then the $P$ of (11.5.3) monotonically decreases, i.e., the power of the test monotonically increases as $\gamma$ tends away from 1, which is the value on the null hypothesis.


11.6. It can be shown by very lengthy and tedious calculations that the 
two tests considered in 11.1 and 11.2 for multivariate analysis of variance and for in- 
dependence between two sets of variates as also the modified tests considered in 11.4 
and 11.5 for one and two dispersion matrices have each of them the monotonicity 
property, and not just the near monotonicity property which has been proved in 
chapters 10 and 11. But this lengthy proof is not being offered, in the hope that a 
much simpler and more elegant proof may be forthcoming in the near future. 


CHAPTER TWELVE 


Least Squares and Univariate Analysis of Variance and 


Covariance with Multivariate Extensions 


12.1. Statement of the problems. Let $x(n \times 1)$ denote a set of $n$ uncorrelated stochastic variates with the same (unknown) variance $\sigma^2$ and let $E(x)$ be subject to the constraint:

$$E(x) = A(n \times m)\,\xi(m \times 1), \tag{12.1.1}$$

where $m < n$ and $\xi(m \times 1)$ is a set of unknown parameters (to be estimated) and $A$ is a matrix of rank $r \le m < n$, whose elements are given by the particular experimental design.

Problem I: Given a non-null $c'(1 \times m)$ (subject to certain restrictions to be brought out in (12.2)) and given $x$, it is required to obtain for $c'\xi$ a linear estimate $b'(1 \times n)\,x(n \times 1)$ such that (i) $E(b'x) = c'\xi$ (for all $\xi$) and (ii) variance $(b'x)$ is to be a minimum. $c'\xi$ will be said to be linearly estimable (or sometimes just "estimable") if and only if (i) is satisfied.

Problem II: Given $c'$ and $x$ as above, it is required to obtain $\hat\xi$ so that $(x' - \hat\xi'A')(x - A\hat\xi)$ is a minimum. It will then be incidentally verified that $b'x$ of Problem I $= c'\hat\xi$ of Problem II.

Problem III: To the model of Problem I add the further condition that the $x_i$'s are independent $N(E(x_i), \sigma^2)$ $(i = 1, 2, \ldots, n)$. Let us now try to obtain (in terms of given elements) the customary $F$-test for the hypothesis

$$C(q \times m)\,\xi(m \times 1) = 0(q \times 1), \tag{12.1.2}$$

where $r \le m$ ($r$ being the rank of the $A$-matrix of (12.1.1)) and $C$ is a given matrix of rank $s \le \min(r, q)$.

The model indicated, under which the hypothesis (12.1.2) is tested, is usually called the linear hypothesis model, or in more recent years, the model I of analysis of variance. The hypothesis (12.1.2) is called the linear hypothesis. Now going back to (12.1.1) we observe that for the usual types of experiments when they do not involve regression, $A$ is a matrix whose elements are ordinarily 0 or 1. For experiments which also involve regression on the so-called "concomitant variates" which are really certain observations supposed to stay constant within the probabilistic set-up of the experiment and the analysis, $A$ is a matrix some of whose elements involve these "non-stochastic" observations, the rest of the elements being pure constants, mostly 0 or 1. That will be called the general regression set-up under the so-called model I of analysis of variance and covariance.
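12.2. Solution of Problem I. Assume that $A'(m \times n)$ is such that $A_1'(r \times n)$ can be taken as a basis and let $A'(m \times n)$ of (12.1.1) be factorized into:

$$A' = \begin{bmatrix} A_1' \\ A_2' \end{bmatrix} = \begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix} L(r \times n), \tag{12.2.1}$$

where $A_1'$ is $r \times n$, $A_2'$ is $(m-r) \times n$, $\tilde T_1$ is $r \times r$ and $T_2$ is $(m-r) \times r$, such that $LL' = I(r)$, and let $L_1((n-r) \times n)$ be an arbitrary completion of $L$ in the sense of (A.1.15), so that

$$\begin{bmatrix} L \\ L_1 \end{bmatrix} [L' : L_1'] = I(n). \tag{12.2.2}$$

Notice that $L_1$ is not unique. Also observe that

$$I(n) = [L' : L_1'] \begin{bmatrix} L \\ L_1 \end{bmatrix} = L'L + L_1'L_1. \tag{12.2.3}$$

Furthermore, with an $A$ having the structure (12.2.1), let (12.1.1) be rewritten as

$$E(x) = [A_1 : A_2] \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}, \tag{12.2.4}$$

where $A_1$ is $n \times r$, $A_2$ is $n \times (m-r)$, $\xi_1$ is $r \times 1$ and $\xi_2$ is $(m-r) \times 1$, and let $c'\xi$ be rewritten as

$$[c_1' : c_2'] \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}, \tag{12.2.5}$$

where $c_1'$ is $1 \times r$ and $c_2'$ is $1 \times (m-r)$.

Now condition (i) (of unbiassedness) of Problem I of (12.1) becomes

$$[c_1' : c_2'] \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} = E(b'x) = b'E(x) = b'[A_1 : A_2] \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} = b'(A_1\xi_1 + A_2\xi_2), \tag{12.2.6}$$

and, since this is to be true of all $\xi_1$ and $\xi_2$, we should have

$$b'A_1 = c_1'\ \text{and}\ b'A_2 = c_2', \tag{12.2.7}$$

which imposes a number of restrictions $(\le m)$ on $b'(1 \times n)$ but by no means fully determines $b'$ (which has to be determined).

Substituting in (12.2.7) for $A_1$ and $A_2$ from (12.2.1) we have

$$b'L'\tilde T_1' = c_1'\ \text{or}\ b'L' = c_1'(\tilde T_1')^{-1},\ \text{and}\ b'L'T_2' = c_2'. \tag{12.2.8}$$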
Now to minimize $V(b'x)$ subject to (12.2.8) we proceed as follows:

$$V(b'x) = \sigma^2 b'b\ \text{(since } x \text{ is an uncorrelated set with a common variance } \sigma^2\text{)} \tag{12.2.9}$$

$$= \sigma^2 b'(L'L + L_1'L_1)b = \sigma^2[c_1'(\tilde T_1')^{-1}(\tilde T_1)^{-1}c_1 + b'L_1'L_1b]\ \text{(using (12.2.3) and (12.2.8))}.$$

The minimum $V(b'x)$ is thus reached when

$$b'L_1' = 0, \tag{12.2.10}$$

so that, combining (12.2.2), (12.2.8) and (12.2.10), we have

$$b' = c_1'(\tilde T_1')^{-1}L \tag{12.2.11}$$

and hence

$$b'x = c_1'(\tilde T_1')^{-1}Lx = c_1'(\tilde T_1')^{-1}(\tilde T_1)^{-1}A_1'x\ \text{(using (12.2.1))} = c_1'(\tilde T_1\tilde T_1')^{-1}A_1'x = c_1'(A_1'A_1)^{-1}A_1'x. \tag{12.2.12}$$

This gives the "unbiassed minimum variance" estimate of $c'\xi$.
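
As an illustration of (12.2.12), the following sketch computes the coefficient vector $b' = c_1'(A_1'A_1)^{-1}A_1'$ for a small design and checks the unbiassedness condition $b'A = c'$. It is only a sketch under stated assumptions: the one-way layout with two groups of two observations, the helper names (`transpose`, `matmul`, `inverse`, `blue_coefficients`) and the use of exact rational arithmetic are choices made here for illustration and are not taken from the text.

```python
from fractions import Fraction


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(m):
    """Inverse of a non-singular square matrix by Gauss-Jordan elimination."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def blue_coefficients(a1, c1):
    """Return b' = c1'(A1'A1)^{-1}A1' for a basis A1 of full column rank."""
    a1t = transpose(a1)
    gram_inv = inverse(matmul(a1t, a1))
    return matmul(matmul([c1], gram_inv), a1t)[0]


# Assumed example: columns are (general mean, group 1, group 2); rank r = 2.
A = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1]]
A1 = [row[:2] for row in A]      # first two columns taken as the basis
c = [0, 1, -1]                   # contrast between the two group effects
b = blue_coefficients(A1, c[:2])
b_times_A = matmul([b], A)[0]    # should reproduce c' when c'xi is estimable
print(b, b_times_A)
```

In this assumed layout the resulting $b'x$ is the difference between the two group means, and $b'A$ reproduces $c'$, which is condition (i) of Problem I.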
Restriction on $c'$. Now, going back to (12.2.8) we have

$$c_2' = b'A_2 = c_1'(\tilde T_1')^{-1}LA_2 = c_1'(\tilde T_1')^{-1}(\tilde T_1)^{-1}A_1'A_2\ \text{(using (12.2.1))} = c_1'(A_1'A_1)^{-1}A_1'A_2. \tag{12.2.13}$$

We have thus that, in order that $c'\xi$, i.e., $[c_1' : c_2'] \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}$ may be "estimable" (in the sense indicated), $c_2'$ must be related to $c_1'$ by (12.2.13), which can be expressed in another form that is more suggestive. From (12.2.1) we have

$$A_2' = T_2L = T_2(\tilde T_1)^{-1}A_1'\ \text{or}\ A_2 = A_1(\tilde T_1')^{-1}T_2', \tag{12.2.14}$$

which on substitution into (12.2.13) yields

$$c_2' = c_1'(A_1'A_1)^{-1}A_1'A_1(\tilde T_1')^{-1}T_2' = c_1'(\tilde T_1')^{-1}T_2'. \tag{12.2.15}$$

Thus $c_2'$ is related to $c_1'$ by the same post factor by which $A_2$ is related to $A_1$.
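
A small worked case may make the restriction concrete. The following is an assumed example, not taken from the text: a one-way layout with two groups of two observations and parameters $\xi' = (\mu, \tau_1, \tau_2)$, so that $m = 3$ and $r = 2$.

```latex
\[
A=\begin{bmatrix}1&1&0\\1&1&0\\1&0&1\\1&0&1\end{bmatrix},\qquad
A_1=\begin{bmatrix}1&1\\1&1\\1&0\\1&0\end{bmatrix},\qquad
A_2=\begin{bmatrix}0\\0\\1\\1\end{bmatrix}
   =A_1\begin{bmatrix}1\\-1\end{bmatrix}.
\]
% The post factor relating A_2 to A_1 is (1,-1)', so the restriction reads
\[
c_2 = c_1'\begin{bmatrix}1\\-1\end{bmatrix},
\qquad\text{i.e., with } c_1'=(a,\,b):\quad c_2 = a-b .
\]
% c' = (0, 1, -1): a - b = -1 = c_2, so \tau_1 - \tau_2 is estimable.
% c' = (1, 1,  0): a - b =  0 = c_2, so \mu + \tau_1   is estimable.
% c' = (0, 1,  0): a - b = -1 \ne 0, so \tau_1 alone is not estimable.
```

The same post factor $(1, -1)'$ is obtained from $(A_1'A_1)^{-1}A_1'A_2$, since here $A_1'A_2 = (2, 0)'$ and $(A_1'A_1)^{-1} = \begin{bmatrix} 1/2 & -1/2 \\ -1/2 & 1 \end{bmatrix}$.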


Invariance of the linear estimate (12.2.12) under choice of $A_1'$. If, instead of $A_1'$ and $c_1'$, we choose another set of independent row vectors, say $A_3'$ and $c_3'$ to match it, then in place of the right hand side of (12.2.12) we should have the linear estimate given by replacing the subscript 1 by 3. But remembering that

$$A_3'(r \times n) = T_3(r \times r)\,L(r \times n), \tag{12.2.16}$$

where $T_3$ is obtained by picking out from the right hand side of (12.2.1) the rows corresponding to $A_3'$, and is necessarily non-singular (since $A_3'$ is of rank $r$), and using (12.2.15) and (12.2.16), we have

$$c_3'(A_3'A_3)^{-1}A_3'x = c_3'(T_3')^{-1}(T_3)^{-1}A_3'x = c_1'(\tilde T_1')^{-1}T_3'(T_3')^{-1}(T_3)^{-1}T_3Lx = c_1'(\tilde T_1')^{-1}Lx = c_1'(A_1'A_1)^{-1}A_1'x, \tag{12.2.17}$$

which proves the invariance.

Variance of the "unbiassed minimum variance" estimate. From (12.2.9), (12.2.11) this variance is given by

$$V(b'x) = \sigma^2b'b = \sigma^2c_1'(\tilde T_1')^{-1}LL'(\tilde T_1)^{-1}c_1 = \sigma^2c_1'(\tilde T_1\tilde T_1')^{-1}c_1 = \sigma^2c_1'(A_1'A_1)^{-1}c_1, \tag{12.2.18}$$

which again by the method of the previous paragraph, can be shown to be invariant under choice of $A_1'$.


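12.3. Solution of Problem II or the "Least squares solution".

$$(x' - \xi'A')(x - A\xi) = (x' - \xi'A')[L' : L_1'] \begin{bmatrix} L \\ L_1 \end{bmatrix} (x - A\xi)$$

$$= \left[x' - \xi' \begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix} L\right] [L' : L_1'] \begin{bmatrix} L \\ L_1 \end{bmatrix} \left[x - L'[\tilde T_1' : T_2']\xi\right] \tag{12.3.1}$$

$$= \left[x'L' - \xi' \begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}\right] \left[Lx - [\tilde T_1' : T_2']\xi\right] + x'L_1'L_1x$$

(using (12.2.3)). It is now quite easy to see that given $x$ and $A$ the minimum value of $(x' - \xi'A')(x - A\xi)$, under variation of $\xi$, will be attained if

$$Lx = [\tilde T_1' : T_2']\hat\xi. \tag{12.3.3}$$

If we now want the "least squares estimate" $c'\hat\xi$ of an "estimable linear function" $c'\xi$, we have from the above: $c'\hat\xi = c_1'\hat\xi_1 + c_2'\hat\xi_2 = c_1'(\tilde T_1')^{-1}[\tilde T_1' : T_2']\hat\xi$ (from (12.2.15)) $= c_1'(\tilde T_1')^{-1}Lx$ (from (12.3.3)) $= c_1'(\tilde T_1')^{-1}(\tilde T_1)^{-1}A_1'x$ (from (12.2.1)) $= c_1'(A_1'A_1)^{-1}A_1'x$, which proves the identity of the "least squares solution" of an "estimable linear function" with the "unbiassed minimum variance solution".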
12.4. Solution of Problem III. It is well known that

if $x(n \times 1)$ is a set of $n$ uncorrelated $N(E(x), \sigma^2I(n))$ (and thus also independent) variates and if $L(p \times n)$ $(p \le n)$ is subject to $LL' = I(p)$, then $L(p \times n)\,x(n \times 1)$ is a set of $p$ uncorrelated $N(LE(x), \sigma^2I(p))$ variates. (12.4.1)

It is also well known that

if $u(p \times 1)$ is an independent $N(0, \sigma^2)$ set and so is $v(q \times 1)$ and if $u$ and $v$ are mutually independent, then $u'u/\sigma^2$ is a $\chi^2$ with $p$ degrees of freedom, $v'v/\sigma^2$ is a $\chi^2$ with $q$ degrees of freedom and $qu'u/pv'v$ is an $F$ with degrees of freedom $p$ and $q$. (12.4.2)

Going back to the model of Problem III in (12.1) and to (12.2.1)-(12.2.3) we observe that

if $x(n \times 1)$ is an uncorrelated $N(E(x), \sigma^2I(n))$ set, then $L(r \times n)\,x(n \times 1)$ is an uncorrelated $N(LE(x), \sigma^2I(r))$ set and $L_1((n-r) \times n)\,x(n \times 1)$ is an uncorrelated $N(L_1E(x), \sigma^2I(n-r))$ set which is also independent of the $Lx$ set, since $LL_1' = 0$. (12.4.3)

Now from (12.1.1) and (12.2.1)-(12.2.3) we have $E(x) = L'[\tilde T_1' : T_2']\xi$, so that $L_1E(x) = L_1L'[\tilde T_1' : T_2']\xi = 0$. Thus $L_1x$ is an independent $N(0, \sigma^2)$ set, whence it follows that we have a $\chi^2$ (with $n-r$ degrees of freedom) given by:

$$x'L_1'L_1x/\sigma^2\ \text{or}\ x'(I(n) - L'L)x/\sigma^2\ \text{or}\ [x'x - x'A_1(\tilde T_1\tilde T_1')^{-1}A_1'x]/\sigma^2\ \text{or}\ [x'x - x'A_1(A_1'A_1)^{-1}A_1'x]/\sigma^2. \tag{12.4.4}$$


Consider now the hypothesis $C(q \times m)\,\xi(m \times 1) = 0$, where $C$ is of rank $s \le \min(q, r)$, $r$ being the rank of the $A$-matrix and thus being $\le m$. Let us rewrite the hypothesis as

$$\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix} \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} = 0, \tag{12.4.5}$$

where the row blocks are of orders $s$ and $q-s$ and the column blocks of orders $r$ and $m-r$, where $[C_{11}\ C_{12}]$ are a set of $s$ independent row vectors and $\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}$ is a matrix, each row of which is of the nature of $c'$ of (12.2). In this case the hypothesis (12.4.5) will be said to be "testable".

It is now easy to see that the hypothesis $C\xi = 0$ is equivalent to $C_{11}\xi_1 + C_{12}\xi_2 = 0$, so that we shall work in terms of this latter. Going back to (12.2.12), (12.2.13) and (12.2.15) we note that

$$C_{12} = C_{11}(\tilde T_1')^{-1}T_2' \tag{12.4.6}$$

and

$$0 = C_{11}\xi_1 + C_{12}\xi_2 = E[C_{11}(A_1'A_1)^{-1}A_1'x] = E[C_{11}(\tilde T_1')^{-1}Lx]. \tag{12.4.7}$$

Now $(\tilde T_1')^{-1}L$ is an $r \times n$ matrix of rank $r$ and $C_{11}$ is an $s \times r$ matrix $(s \le r)$ of rank $s$. Then using (A.1.6) we note that $C_{11}(\tilde T_1')^{-1}L$, which is an $s \times n$ matrix, must be of rank $s \le \min(q, r)$ (note that $r \le m < n$). Let

$$C_{11}(\tilde T_1')^{-1}L = \tilde V(s \times s)\,M(s \times n), \tag{12.4.8}$$

where $MM' = I(s)$, and $\tilde V$ of course is non-singular. Then we have

$$E(Mx) = (\tilde V)^{-1}E[C_{11}(\tilde T_1')^{-1}Lx] = 0 \tag{12.4.9}$$

(from (12.4.7)) and furthermore

$$ML_1' = (\tilde V)^{-1}C_{11}(\tilde T_1')^{-1}LL_1' = 0, \tag{12.4.10}$$

so that

$Mx$ is an $s$-set of independent $N(0, \sigma^2)$, $L_1x$ (of (12.4.3)) is an $(n-r)$-set of independent $N(0, \sigma^2)$, $Mx$ and $L_1x$ are mutually independent, (12.4.11)

and hence

$(n-r)\,x'M'Mx / s\,x'L_1'L_1x$ is an $F$ with degrees of freedom $s$ and $n-r$. (12.4.12)
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Using (12.4.4), (12.4.8) and of course (12.2.1) and (A.3.11), we can reduce 

(12.4.12) to 

(n—r)x'A(Ay 43)305[05(0414:)01]10,4(4141)7 41x (12.4.18) 

s[x'x —x A1(4,141)-1A1x] bass Tt 

which is an (with degrees of freedom s and n—r) for testing the hypothesis CE = 0 
and which is expressed in terms of quantities directly observed or given by the experi- 
mental design and the hypothesis to be tested. The form (12.4.13) can be shown 
to be invariant under the kind of choice indicated in (12.2), i.e. under the choice 
of a basis of A, in much the same way as there. 
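
As an illustration of (12.4.13), the following is a minimal sketch under stated assumptions, not part of the original development. It takes the simplest design, in which A is the 0/1 indicator matrix of two groups (so r = m = 2 and A'A = diag(n1, n2)), and the hypothesis xi_1 - xi_2 = 0 (so C = [1, -1] and s = 1). The function names are hypothetical. In this case the F of (12.4.13) should coincide with the square of the usual pooled two-sample t.

```python
from math import sqrt


def f_statistic_two_groups(group1, group2):
    """F of (12.4.13) for a two-group indicator design and C = [1, -1].

    Assumptions (illustrative only): A is the 0/1 indicator matrix of the
    two groups, so (A'A)^{-1} = diag(1/n1, 1/n2), r = 2 and s = 1.
    """
    n1, n2 = len(group1), len(group2)
    n, r, s = n1 + n2, 2, 1
    mean1, mean2 = sum(group1) / n1, sum(group2) / n2
    # C (A'A)^{-1} A'x = mean1 - mean2 and C (A'A)^{-1} C' = 1/n1 + 1/n2.
    hypothesis_ss = (mean1 - mean2) ** 2 / (1 / n1 + 1 / n2)
    # x'x - x'A(A'A)^{-1}A'x is the within-group sum of squares.
    error_ss = (sum((v - mean1) ** 2 for v in group1)
                + sum((v - mean2) ** 2 for v in group2))
    return (n - r) * hypothesis_ss / (s * error_ss)


def pooled_t_statistic(group1, group2):
    """Customary two-sample t with a pooled variance estimate."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = sum(group1) / n1, sum(group2) / n2
    error_ss = (sum((v - mean1) ** 2 for v in group1)
                + sum((v - mean2) ** 2 for v in group2))
    pooled_variance = error_ss / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_variance * (1 / n1 + 1 / n2))
```

With any two samples, `f_statistic_two_groups(g1, g2)` and `pooled_t_statistic(g1, g2) ** 2` agree up to rounding, which is the familiar relation between F with 1 and n-2 degrees of freedom and t with n-2 degrees of freedom.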

12.5. Conditions that k different linear hypotheses may be testable in a
quasi-independent manner. Suppose instead of the hypothesis (12.1.2) we have the
following hypotheses:

$$C^{(i)}(q_i\times m)\,\xi(m\times 1) = 0(q_i\times 1), \ \text{ with } i = 1, 2, \ldots, k, \tag{12.5.1}$$

or breaking down into submatrices, we have in place of (12.4.5) the following:

$$\begin{bmatrix} C^{(i)}_{11} & C^{(i)}_{12} \\ C^{(i)}_{21} & C^{(i)}_{22} \end{bmatrix} \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} = 0(q_i\times 1), \ \text{ with } i = 1, 2, \ldots, k, \tag{12.5.2}$$

where the row blocks have $s_i$ and $q_i - s_i$ rows and the column blocks have $r$ and $m-r$ columns.

Now by (12.4.12) the $i$-th hypothesis of (12.5.1) will have an $F_i$ to go with it, where

$$F_i = \frac{(n-r)\,x'M_i'M_ix}{s_i\,x'L_1'L_1x}, \tag{12.5.3}$$

and is distributed as an $F$ with degrees of freedom $s_i$ and $n-r$. Also $x'M_i'M_ix/\sigma^2$ is a
$\chi_i^2$ with degrees of freedom $s_i$ and $x'L_1'L_1x/\sigma^2$ is a $\chi^2$ with degrees of freedom $n-r$. It is
clear from (12.4.10) that each $\chi_i^2$ ($i = 1, 2, \ldots, k$) is distributed independently
of $\chi^2$. The question is, when are the $\chi_i^2$'s themselves mutually independent? If
these are so, then the associated linear hypotheses (12.5.1) will be said to be testable
in a quasi-independent manner. Going back to (12.4.8) we observe that $\chi_i^2$ and $\chi_j^2$
($i \ne j$) will be independent if $M_iM_j' = 0$, that is, if

$$\tilde{V}_i^{-1}C^{(i)}_{11}(\tilde{T}_1')^{-1}LL'\tilde{T}_1^{-1}C^{(j)\prime}_{11}(\tilde{V}_j')^{-1} = 0 \tag{12.5.4}$$

or since $\tilde{V}_i$ and $\tilde{T}_1$ are non-singular, if

$$C^{(i)}_{11}(A_1'A_1)^{-1}(A_1'A_1)(A_1'A_1)^{-1}C^{(j)\prime}_{11} = 0$$

or

$$C^{(i)}_{11}(A_1'A_1)^{-1}C^{(j)\prime}_{11} = 0, \ \text{ with } i \ne j = 1, 2, \ldots, k. \tag{12.5.5}$$

This, therefore, is the set of conditions for the linear hypotheses (12.5.1) being testable
in a quasi-independent manner. From a practical standpoint it serves, when an ap-
propriate breakdown of the sum of squares is not intuitively evident, exactly the same
purpose as Cochran's theorem does when such an appropriate breakdown is intuitively
evident.
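
The condition (12.5.5) can be checked mechanically in simple cases. The sketch below is illustrative only and rests on assumptions not made in the text: a one-way layout whose design matrix is the 0/1 indicator matrix of the groups (so A'A is diagonal with the group sizes on the diagonal and C_11 = C), and hypotheses each consisting of a single row. The function name is hypothetical.

```python
from fractions import Fraction


def quasi_independence_form(c_i, c_j, group_sizes):
    """Value of C^(i) (A'A)^{-1} C^(j)' for two single-row hypotheses.

    Assumption (illustrative only): one-way layout with an indicator
    design matrix, so that (A'A)^{-1} = diag(1/n_1, ..., 1/n_k).
    Exact rational arithmetic is used so that a zero is an exact zero.
    """
    return sum(Fraction(a) * Fraction(b) / n
               for a, b, n in zip(c_i, c_j, group_sizes))


# Two contrasts among three group means.
first_contrast = [1, -1, 0]
second_contrast = [1, 1, -2]

# Equal group sizes: the form vanishes, so (12.5.5) is satisfied.
balanced = quasi_independence_form(first_contrast, second_contrast, [5, 5, 5])

# Unequal group sizes: the form is 1/4 - 1/6, so (12.5.5) fails.
unbalanced = quasi_independence_form(first_contrast, second_contrast, [4, 6, 5])
```

Under these assumptions the two contrasts are testable in a quasi-independent manner when the first two groups are of equal size, and not otherwise.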

12.6. A quasi-multivariate generalization of the problems considered in (12.1).
Suppose that in (12.1) we assume that $x(n\times 1)$ denote a correlated set with a p.d.
dispersion matrix $\sigma^2\Sigma(n\times n)$ where $\sigma^2$ is unknown but $\Sigma$ is supposed to be known
and suppose that (12.1.1) is left unchanged. Also in problem III let us assume
that $x$ is $N(E(x), \sigma^2\Sigma)$ but let us leave (12.1.2) unchanged. Then putting

$$\Sigma(n\times n) = \tilde{T}(n\times n)\,\tilde{T}'(n\times n) \ \text{ and } \ \tilde{T}^{-1}(n\times n)\,x(n\times 1) = y(n\times 1), \tag{12.6.1}$$

it is easy to check that $y(n\times 1)$ is a set of uncorrelated variates with a common variance
$\sigma^2$, and also that if $x$ is $N(E(x), \sigma^2\Sigma)$, $y$ is $N(E(y), \sigma^2I(n))$. Also in terms of $y$, (12.1.1)
reduces to

$$E(y) = \tilde{T}^{-1}A\xi. \tag{12.6.2}$$

It is now easy to check that (12.2.12) reduces to

$$c_1'(A_1'(\tilde{T}')^{-1}\tilde{T}^{-1}A_1)^{-1}A_1'(\tilde{T}')^{-1}y = c_1'(A_1'\Sigma^{-1}A_1)^{-1}A_1'(\tilde{T}')^{-1}y = c_1'(A_1'\Sigma^{-1}A_1)^{-1}A_1'\Sigma^{-1}x. \tag{12.6.3}$$

Also (12.4.13) similarly reduces to

$$\frac{(n-r)\,x'\Sigma^{-1}A_1(A_1'\Sigma^{-1}A_1)^{-1}C_{11}'[C_{11}(A_1'\Sigma^{-1}A_1)^{-1}C_{11}']^{-1}C_{11}(A_1'\Sigma^{-1}A_1)^{-1}A_1'\Sigma^{-1}x}{s\,[x'\Sigma^{-1}x - x'\Sigma^{-1}A_1(A_1'\Sigma^{-1}A_1)^{-1}A_1'\Sigma^{-1}x]}. \tag{12.6.4}$$

It is easy to verify that, under this model, the 'estimability' condition (12.2.13)
and the 'testability' condition (12.4.6) will stay unchanged. The necessary modifica-
tions in the other expressions will also follow in an obvious manner.
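
The transformation (12.6.1) can be illustrated numerically. The sketch below is a minimal example under assumptions of its own: a 2 x 2 dispersion matrix, and a lower triangular factor obtained by the Cholesky method (one convenient choice of factor; the text does not prescribe how the factor is to be computed). All function names are hypothetical. It checks that the dispersion matrix of y, namely the inverse factor times Sigma times the transposed inverse factor, is the identity.

```python
from math import sqrt


def cholesky_2x2(sigma):
    """Lower triangular T with T T' = sigma, for a 2 x 2 p.d. matrix."""
    (a, b), (_, c) = sigma
    t11 = sqrt(a)
    t21 = b / t11
    t22 = sqrt(c - t21 * t21)
    return [[t11, 0.0], [t21, t22]]


def inverse_lower_2x2(t):
    """Inverse of a 2 x 2 lower triangular matrix."""
    (t11, _), (t21, t22) = t
    return [[1.0 / t11, 0.0], [-t21 / (t11 * t22), 1.0 / t22]]


def matmul(p, q):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def whitened_dispersion(sigma):
    """Dispersion of y = T^{-1} x (apart from the factor sigma^2)."""
    t_inverse = inverse_lower_2x2(cholesky_2x2(sigma))
    return matmul(matmul(t_inverse, sigma), transpose(t_inverse))
```

For example, `whitened_dispersion([[4.0, 1.2], [1.2, 2.0]])` returns the 2 x 2 identity matrix up to rounding, in line with the statement that y is an uncorrelated set with a common variance.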

12.7. Multivariate generalization. The set-up for multivariate analysis of
variance and covariance, i.e., for a test of the general multivariate linear hypothesis
is an easy and direct extension of what has been considered so far in this section.

In place of the set-up of section (12.1), consider the more general set-up of
(iiic) of chapter 5, which is the following. Let $X(p\times n)$ (with $p$ and $p(p+1)/2 < n$) be $n$
independently distributed column vectors, the $r$-th vector $x_r(p\times 1)$ being $N(E(x_r), \Sigma)$
($r = 1, 2, \ldots, n$). In place of (12.1.1) we have

$$E(X')(n\times p) = A(n\times m)\,\xi(m\times p), \tag{12.7.1}$$

where $\xi$ is a matrix of unknown parameters and $A(n\times m)$ is given by the design of the
experiment such that it is of rank $r \le m < n$. We recall the observations in connection
with (12.1.1) and note that here also, for the usual type of experiments where they
do not involve regression, $A$ is a matrix whose elements are ordinarily 0 or 1. For
experiments which involve regression on the so-called "concomitant variates", $A$ is
a matrix, some of whose elements involve these "concomitant variates" or non-stochas-
tic observations, the rest of the elements being pure constants mostly 0 or 1. Also
setting

$$A'(m\times n) = \begin{bmatrix} A_1' \\ A_2' \end{bmatrix}, \tag{12.7.2}$$

where $A_1'$ has $r$ rows and $A_2'$ has $m-r$ rows, let $A_1'$ be a basis of $A'$, i.e., of $A$.
In place of the hypothesis (12.1.2) we shall have the hypothesis

$$C(q\times m)\,\xi(m\times p) = 0(q\times p), \tag{12.7.3}$$

where, as before,

$$C(q\times m) = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}, \tag{12.7.4}$$

the row blocks having $s$ and $q-s$ rows and the column blocks $r$ and $m-r$ columns,
such that $C$ is of rank $s$, $[C_{11} : C_{12}]$ forms a basis, and also that $C$ satisfies the
testability condition of the nature of (12.2) which will be here

$$\begin{bmatrix} C_{12} \\ C_{22} \end{bmatrix} = \begin{bmatrix} C_{11} \\ C_{21} \end{bmatrix} [A_1'(r\times n)A_1(n\times r)]^{-1}A_1'(r\times n)A_2(n\times (m-r)). \tag{12.7.5}$$
Just as, following (12.4.5) we observed that the hypothesis $C(q\times m)\,\xi(m\times 1) = 0(q\times 1)$
$\Leftrightarrow [C_{11} : C_{12}]\,\xi(m\times 1) = 0(s\times 1)$, so also it is easy to check that $C(q\times m)\,\xi(m\times p)
= 0(q\times p) \Leftrightarrow [C_{11} : C_{12}]\,\xi(m\times p) = 0(s\times p)$

$$\text{or } \ H_0 : \xi'(p\times m)\begin{bmatrix} C_{11}' \\ C_{12}' \end{bmatrix} = 0(p\times s). \tag{12.7.6}$$

We also observe that

$$H_0 \text{ of (12.7.6)} \Leftrightarrow \bigcap_a H_{0a} = \bigcap_a \left[ a'(1\times p)\,\xi'(p\times m)\begin{bmatrix} C_{11}' \\ C_{12}' \end{bmatrix} = 0'(1\times s) \right], \tag{12.7.7}$$

where $\bigcap_a$ is taken over all non-null $a(p\times 1)$.
Now, using (12.4.13), we have for $H_{0a}$ a critical region of size, say $\beta$, given by

$$\frac{(n-r)\,a'XA_1(A_1'A_1)^{-1}C_{11}'[C_{11}(A_1'A_1)^{-1}C_{11}']^{-1}C_{11}(A_1'A_1)^{-1}A_1'X'a}{s\,[a'XX'a - a'XA_1(A_1'A_1)^{-1}A_1'X'a]} \ge F_\beta(s, n-r), \tag{12.7.8}$$

where $F_\beta(s, n-r)$ is the upper $\beta$-point of the $F$-distribution with degrees of freedom $s$
and $n-r$. Now, as in (iii) of section (6.4) of Chapter 6, using the extended type I
principle, we have for $H_0 = \bigcap_a H_{0a}$, the critical region of size $\alpha\,(> \beta)$ formed by the
union of the regions (12.7.8) over all non-null $a(p\times 1)$, the region being given by

$$c_t \ge c_\alpha(p, s, n-r), \tag{12.7.9}$$

where $t = \min(p, s)$, $c_t$ is the largest characteristic root of $S^*S^{-1}$, and

$$sS^* = XA_1(A_1'A_1)^{-1}C_{11}'[C_{11}(A_1'A_1)^{-1}C_{11}']^{-1}C_{11}(A_1'A_1)^{-1}A_1'X' \tag{12.7.10}$$

and

$$(n-r)S = [XX' - XA_1(A_1'A_1)^{-1}A_1'X']. \tag{12.7.11}$$

This largest characteristic root has the same central distribution as that of the
largest characteristic root that figured in (iii) of (6.4), with degrees of freedom $p$, $s$ and
$n-r$.
The development given above really subsumes an apparently more general
development in which the hypothesis (12.7.3) is replaced by

$$C(q\times m)\,\xi(m\times p)\,M(p\times u) = 0(q\times u), \tag{12.7.11.1}$$

where $u \le p$, $M$ is a given matrix of rank $u$ and $C$ has the same structure as before.
The equation (12.7.6) will now be replaced by

$$H_0 : M'(u\times p)\,\xi'(p\times m)\begin{bmatrix} C_{11}' \\ C_{12}' \end{bmatrix} = 0(u\times s). \tag{12.7.11.2}$$

That the development of this case is really subsumed under the one already
discussed can be shown in the following way. We note that if $X(p\times n)$ (with $p < n$)
be $n$ independently distributed column vectors, the $r$-th vector $x_r(p\times 1)$ being
$N(E(x_r), \Sigma)$ ($r = 1, \ldots, n$), then $M'(u\times p)X(p\times n)$ will be $n$ independently distributed
column vectors, the $r$-th vector being $N(E(M'x_r), M'\Sigma M)$. Putting
$\xi(m\times p)M(p\times u) = \xi^*(m\times u)$, we can now replace (12.7.1) by

$$E(X'M)(n\times u) = A(n\times m)\,\xi^*(m\times u), \tag{12.7.11.3}$$

and (12.7.11.1) and (12.7.11.2) respectively by

$$C(q\times m)\,\xi^*(m\times u) = 0(q\times u) \tag{12.7.11.4}$$

and

$$\xi^{*\prime}(u\times m)\begin{bmatrix} C_{11}' \\ C_{12}' \end{bmatrix} = 0(u\times s). \tag{12.7.11.5}$$

It is thus easy to see that for the hypothesis (12.7.11.4) or (12.7.11.5) we can, in exactly
the same way as before, work out, step by step, a test of the same nature. In (12.7.8),
$X(p\times n)$ is to be replaced by $M'(u\times p)X(p\times n)$; in $(p, s, n-r)$ of (12.7.9), $p$ is to be
replaced by $u$; and in $c_t$ of the same equation $t$ will now stand for $\min(u, s)$; also in
(12.7.10) and (12.7.11), $X$ will have to be replaced by $M'(u\times p)X(p\times n)$. In subsequent
developments (specially in connection with confidence bounds related to multivariate
linear hypotheses on means) it will be understood that we can always switch over
from $H_0 : C\xi = 0$ to $H_0 : C\xi M = 0$, and back and forth. Thus the mathematical
treatment given there will suffice for this apparently more general case.

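The direct reduction to the canonical form of the problem of the joint distri-
bution of the roots $c_1 \le c_2 \le \ldots \le c_t$ and hence of that of $c_t$ is of some interest here,
and it also proves incidentally the statement made after (12.7.11), which of course
can also be proved otherwise. For this reduction we proceed as follows:

$$P(X) = \text{Const} \, \exp[-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}(X - E(X))(X' - E(X'))]\,dX \tag{12.7.12}$$
$$= \text{Const} \, \exp[-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}(X - \xi'A')(X' - A\xi)]\,dX,$$

using (12.7.1).
Next, using the factorization (12.2.1) and the completion (12.2.2), we have

$$A(n\times m) = L'(n\times r)\,[\tilde{T}_1' : \tilde{T}_2'] \ \text{ and an orthogonal } \ \begin{bmatrix} L \\ L_1 \end{bmatrix}(n\times n), \tag{12.7.13}$$

where $\tilde{T}_1'$ is $r\times r$, $\tilde{T}_2'$ is $r\times (m-r)$, $L$ is $r\times n$ and $L_1$ is $(n-r)\times n$.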
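Now use the orthogonal transformation
$Z(p\times n) = X[L' : L_1'] = [Z_1 : Y]$ (say), where $Z_1$ is $p\times r$ and $Y$ is $p\times (n-r)$, i.e.,

$$X = [Z_1 : Y]\begin{bmatrix} L \\ L_1 \end{bmatrix} = Z_1L + YL_1, \tag{12.7.14}$$

to obtain

$$P(Z_1, Y) = \text{Const} \, \exp\left[-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}\left\{\left(Z_1 - \xi'\begin{bmatrix} \tilde{T}_1 \\ \tilde{T}_2 \end{bmatrix}\right)\left(Z_1' - [\tilde{T}_1' : \tilde{T}_2']\xi\right) + YY'\right\}\right]dZ_1\,dY. \tag{12.7.15}$$

Notice that the unbiased minimum variance estimate of $[C_{11} : C_{12}]\,\xi(m\times p)$ is
$C_{11}(A_1'A_1)^{-1}A_1'X'$, so that under $H_0$: $E[C_{11}(A_1'A_1)^{-1}A_1'X'] = 0$. Also we have
$C_{11}(A_1'A_1)^{-1}A_1' = C_{11}(\tilde{T}_1\tilde{T}_1')^{-1}\tilde{T}_1L = C_{11}(\tilde{T}_1')^{-1}L$. Now put

$$C_{11}(s\times r)\,(\tilde{T}_1')^{-1}(r\times r) = \tilde{V}(s\times s)\,M_1(s\times r), \ \text{ where } M_1M_1' = I(s). \tag{12.7.16}$$

Also complete $M_1$ into an orthogonal $\begin{bmatrix} M_1 \\ M_2 \end{bmatrix}$, where $M_1$ has $s$ rows, $M_2$ has $r-s$ rows
and both have $r$ columns.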

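Next use the orthogonal transformation

$Z_1(p\times r)\,[M_1' : M_2'] = [Y^* : Y_1^*]$ (say), where $Y^*$ is $p\times s$ and $Y_1^*$ is $p\times (r-s)$, i.e.,
$Z_1 = Y^*M_1 + Y_1^*M_2$, and notice that

$$Y^* = Z_1M_1' \ \text{ and } \ Y_1^* = Z_1M_2'. \tag{12.7.17}$$

Similarly put

$$\eta^* = \xi'\begin{bmatrix} \tilde{T}_1 \\ \tilde{T}_2 \end{bmatrix}M_1' \ \text{ and } \ \eta_1^* = \xi'\begin{bmatrix} \tilde{T}_1 \\ \tilde{T}_2 \end{bmatrix}M_2'. \tag{12.7.18}$$

We now have for $(Y^*, Y_1^*, Y)$ the distribution

$$P(Y^*, Y_1^*, Y) = \text{Const} \, \exp\left[-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}\left\{\left((Y^*M_1 + Y_1^*M_2) - \xi'\begin{bmatrix} \tilde{T}_1 \\ \tilde{T}_2 \end{bmatrix}\right)\left((M_1'Y^{*\prime} + M_2'Y_1^{*\prime}) - [\tilde{T}_1' : \tilde{T}_2']\xi\right) + YY'\right\}\right]dY^*\,dY_1^*\,dY$$
$$= \text{Const} \, \exp[-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}\{(Y^* - \eta^*)(Y^{*\prime} - \eta^{*\prime}) + (Y_1^* - \eta_1^*)(Y_1^{*\prime} - \eta_1^{*\prime}) + YY'\}]\,dY^*\,dY_1^*\,dY. \tag{12.7.19}$$

Integrating out over $Y_1^*$, we have for $(Y^*, Y)$ the joint distribution

$$P(Y^*, Y) = \text{Const} \, \exp[-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}\{(Y^* - \eta^*)(Y^{*\prime} - \eta^{*\prime}) + YY'\}]\,dY^*\,dY. \tag{12.7.20}$$

Notice that $Y(p\times (n-r))\,Y'((n-r)\times p)$ is, a.e., p.d. (assuming of course that $p \le n-r$)
and $Y^*(p\times s)\,Y^{*\prime}(s\times p)$ is, a.e., at least p.s.d. of rank $t = \min(p, s)$. Also check that
$YY' = XL_1'L_1X' = $ the right side of (12.7.11) and
$Y^*Y^{*\prime} = Z_1M_1'M_1Z_1' = Z_1\tilde{T}_1^{-1}C_{11}'(\tilde{V}')^{-1}\tilde{V}^{-1}C_{11}(\tilde{T}_1')^{-1}Z_1' = XL'\tilde{T}_1^{-1}C_{11}'(\tilde{V}')^{-1}\tilde{V}^{-1}C_{11}(\tilde{T}_1')^{-1}LX' = $ the right side of
(12.7.10). Furthermore, check that

$$\eta^* = \xi'\begin{bmatrix} \tilde{T}_1 \\ \tilde{T}_2 \end{bmatrix}M_1' = \xi'\begin{bmatrix} \tilde{T}_1 \\ \tilde{T}_2 \end{bmatrix}\tilde{T}_1^{-1}C_{11}'(\tilde{V}')^{-1} = (\xi_1'C_{11}' + \xi_2'\tilde{T}_2\tilde{T}_1^{-1}C_{11}')(\tilde{V}')^{-1} = (\xi_1'C_{11}' + \xi_2'C_{12}')(\tilde{V}')^{-1},$$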
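so that, if $C\xi = 0$, i.e., if $\xi_1'C_{11}' + \xi_2'C_{12}' = 0$, then $\eta^* = 0$. Also notice, after some
calculations, that

$$\eta^*\eta^{*\prime} = (\xi_1'C_{11}' + \xi_2'C_{12}')[C_{11}(A_1'A_1)^{-1}C_{11}']^{-1}(C_{11}\xi_1 + C_{12}\xi_2). \tag{12.7.21}$$

Thus, (12.7.20) may be regarded as a quasi-canonical form of the joint distri-
bution problem of the roots, and now, using the same technique as in (A.7.5) and
denoting by $Y_1$ and $Y_0$ the transforms of $Y^*$ and $Y$ (recall that the roots are
invariant under this transformation), we have for $Y_1(p\times s)$ and $Y_0(p\times (n-r))$ the pro-
bability law

$$P(Y_1, Y_0) = \text{Const} \, \exp\left[-\tfrac{1}{2}\,\mathrm{tr}\left\{Y_1Y_1' + \begin{bmatrix} D_\gamma & 0 \\ 0 & 0 \end{bmatrix} + Y_0Y_0' - 2Y_1(p\times s)\begin{bmatrix} D_{\sqrt{\gamma}} & 0 \\ 0 & 0 \end{bmatrix}\right\}\right]dY_1\,dY_0, \tag{12.7.22}$$

where the first partitioned matrix is $p\times p$ with row and column blocks $t$ and $p-t$, the second
is $s\times p$ with row blocks $t$ and $s-t$ and column blocks $t$ and $p-t$, and where, as usual,
$D_\gamma$ stands for a diagonal matrix whose diagonal elements are the
roots $\gamma$'s of the matrix $\eta^*\eta^{*\prime}\Sigma^{-1}$ (some of which may be zero). Now notice that

$$\mathrm{tr}\begin{bmatrix} D_\gamma & 0 \\ 0 & 0 \end{bmatrix} = \sum_{i=1}^{t}\gamma_i \ \text{ and } \ \mathrm{tr}\,Y_1\begin{bmatrix} D_{\sqrt{\gamma}} & 0 \\ 0 & 0 \end{bmatrix} = \sum_{i=1}^{t}(Y_1)_{ii}\,\gamma_i^{1/2}, \tag{12.7.23}$$

and rewrite (12.7.22) as

$$\text{Const} \, \exp\left[-\tfrac{1}{2}\left\{\mathrm{tr}\,(Y_1Y_1' + Y_0Y_0') + \sum_{i=1}^{t}\gamma_i - 2\sum_{i=1}^{t}(Y_1)_{ii}\,\gamma_i^{1/2}\right\}\right]dY_1\,dY_0, \tag{12.7.24}$$

which is, therefore, the canonical form for the joint distribution of the roots in the
general case.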

In another monograph under the title "Least squares and analysis of variance
and covariance" which will be a sequel to this one, use will be made of the formulae
of this chapter to obtain the customary tests and estimates relating to the standard
classes of designs in the context of what is called model I. Adjustment of the general
theory given here to the situations of the other models and the derivation of some
actual formulae involved in the analysis of some concrete situations there, will also
be discussed in that sequel. In actual application we repeatedly run into the problem
of inverting matrices which have certain kinds of pattern. Methods will be discussed
of obtaining inverses of these patterned matrices in a very simple manner without
having recourse to Doolittle's method or any other such method. These latter, while
extremely useful for general matrices, can be luckily dispensed with so far as these
particular patterned matrices are concerned.


The above set-up is specially useful for a general discussion of linear estimation
or testing of linear hypotheses, although it also leads, without much calculation, to the
formulae for the different customary designs. However, there is another set-up
to be discussed in the later monograph, which gives the different customary formulae
in an even easier manner, although this is not so suitable for a general discussion.

CHAPTER THIRTEEN 


Some Univariate and Bivariate Confidence Bounds* 


13.1. Some general observations. The general theory (to which nothing is
added here) of confidence bounds, like the general theory of testing of hypotheses and
tests of significance (with a part of which the previous sections have been concerned),
has been worked out in a series of papers, now classic. This is readily available not
only in papers but in standard books as well and need not be explained here. How-
ever, except for some comparatively recent work, most of the earlier applications
have been concerned with confidence bounds on a single parameter or a single function
of the parameters. Simultaneous confidence bounds on several parameters or para-
metric functions offer nothing new in principle, being already inherent in the general
theory and will not, therefore, be discussed here from the point of view of the general
theory. In this chapter several examples from univariate normal populations and one
from a bivariate normal population will be discussed (some of them simultaneous and
some of them "single") which will prepare the ground for the multivariate examples
(all of them simultaneous) to be discussed in chapter 14. In this chapter, in every
case except one we shall start from a current test of the corresponding hypothesis
(having a number of optimum properties in respect of power) and obtain by inversion
"single" or "simultaneous" confidence bounds which, therefore, by the general theory,
will have similar optimum properties in respect of shortness, i.e., the probability of
covering wrong values of the parameters or parametric functions.

13.2. Means of normal populations. 

(i) For $N(\xi, \sigma^2)$ we have, in terms of a sample of size $n$ with sample mean $\bar{x}$ and
sample standard deviation $s$, the following well known confidence interval for $\xi$ (with
a confidence coefficient $1-\alpha$)

$$\bar{x} - s\,t_{\alpha/2}(n-1)/\sqrt{n} \le \xi \le \bar{x} + s\,t_{\alpha/2}(n-1)/\sqrt{n}, \tag{13.2.1}$$

where $t_{\alpha/2}(n-1)$ is the upper $\alpha/2$ point of the ordinary $t$-distribution with d.f. $(n-1)$.

(ii) For $N(\xi_h, \sigma^2)$ ($h = 1, 2$) we have, in terms of two samples of sizes $n_h$
with sample means and sample standard deviations $\bar{x}_h$ and $s_h$ ($h = 1, 2$), the following
well known confidence interval for $\xi_1 - \xi_2$ (with a confidence coefficient $1-\alpha$)

$$(\bar{x}_1 - \bar{x}_2) - s\,t_{\alpha/2}(n-2)/\sqrt{n_{12}} \le \xi_1 - \xi_2 \le (\bar{x}_1 - \bar{x}_2) + s\,t_{\alpha/2}(n-2)/\sqrt{n_{12}}, \tag{13.2.2}$$

where $n = n_1 + n_2$, $s^2 = [(n_1-1)s_1^2 + (n_2-1)s_2^2]/(n-2)$, $n_{12} = n_1n_2/n$ and $t_{\alpha/2}(n-2)$ is
the upper $\alpha/2$ point of the ordinary $t$-distribution with d.f. $n-2$, i.e., $n_1 + n_2 - 2$.

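(iii) For confidence bounds relating to $\xi_h$'s of $N(\xi_h, \sigma^2)$ ($h = 1, 2, \ldots, k$,
where $k > 2$) we proceed as follows. Suppose we have random samples of sizes $n_h$,
sample means $\bar{x}_h$, sample standard deviations $s_h$ ($h = 1, 2, \ldots, k$). Put

$$n = \sum_{h=1}^{k} n_h, \quad s^2 = \sum_{h=1}^{k}(n_h-1)s_h^2/(n-k), \quad \bar{x} = \sum_{h=1}^{k} n_h\bar{x}_h/n, \quad s^{*2} = \sum_{h=1}^{k} n_h(\bar{x}_h - \bar{x})^2/(k-1), \quad \bar{\xi} = \sum_{h=1}^{k} n_h\xi_h/n. \tag{13.2.3}$$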


* See references [27, 28, 29, 44, 49, 51] in this connection.

For the hypothesis $H_0 : \xi_1 = \xi_2 = \ldots = \xi_k$, we have at
a level of significance, say $\alpha$, the current $F$-test with a critical region

$$F \equiv s^{*2}/s^2 \ge F_\alpha(k-1, n-k), \tag{13.2.4}$$

where $F_\alpha(k-1, n-k)$ stands for the upper $\alpha$ point of the central $F$-distribution with
d.f. $(k-1)$ and $(n-k)$ (we recall the well known result that when $H_0$ is true $s^{*2}/s^2$ is
distributed as the central $F$). When $H_0$ is not true, it is easy enough to see that
$s^{**2}/s^2$ is distributed as the central $F$, where $s^{**2}$ is given by

$$s^{**2} = \sum_{h=1}^{k} n_h[(\bar{x}_h - \bar{x}) - (\xi_h - \bar{\xi})]^2/(k-1). \tag{13.2.5}$$

Suppose that we now start from a statement with probability $1-\alpha$, namely

$$s^{**2}/s^2 \le F_\alpha(k-1, n-k), \ \text{ i.e., } \ \sum_{h=1}^{k} n_h[(\bar{x}_h - \bar{x}) - (\xi_h - \bar{\xi})]^2/(k-1)s^2 \le F_\alpha(k-1, n-k). \tag{13.2.6}$$

It is easy to check (see (A.2.7)) that the statement (13.2.6) $\Leftrightarrow$ the following statement:

$$\left|\sum_{h=1}^{k} a_hn_h^{1/2}[(\bar{x}_h - \bar{x}) - (\xi_h - \bar{\xi})]\right| \le s[(k-1)F_\alpha(k-1, n-k)]^{1/2}$$

or

$$\sum_{h=1}^{k} a_hn_h^{1/2}(\bar{x}_h - \bar{x}) - s[(k-1)F_\alpha(k-1, n-k)]^{1/2} \le \sum_{h=1}^{k} a_hn_h^{1/2}(\xi_h - \bar{\xi}) \le \sum_{h=1}^{k} a_hn_h^{1/2}(\bar{x}_h - \bar{x}) + s[(k-1)F_\alpha(k-1, n-k)]^{1/2}, \tag{13.2.7}$$

for all arbitrary $a_h$'s subject to $\sum_{h=1}^{k} a_h^2 = 1$. (13.2.7) is obviously a set of simultaneous
confidence bounds on all arbitrary linear compounds of $n_h^{1/2}(\xi_h - \bar{\xi})$ ($h = 1, 2, \ldots, k$),
the compounding coefficients $a$'s being subject to $\sum a_h^2 = 1$. It is also easy to verify
that the set of such linear compounds could be otherwise written as

$$\sum_{h=1}^{k} a_hn_h^{1/2}\xi_h, \ \text{ for all } a_h\text{'s subject to } \sum_{h=1}^{k} a_h^2 = 1 \ \text{ and } \ \sum_{h=1}^{k} a_hn_h^{1/2} = 0. \tag{13.2.8}$$
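
The bounds (13.2.7) are easy to compute once the upper point of the F-distribution is available. The following is a minimal sketch, not a definitive implementation: the function name is hypothetical, the critical value `f_crit` is assumed to be supplied by the user from a table of the F-distribution with k-1 and n-k degrees of freedom, and the coefficients are rescaled inside the function so that their squares sum to unity.

```python
from math import sqrt


def simultaneous_bounds(samples, coefficients, f_crit):
    """Bounds of (13.2.7) on sum_h a_h n_h^(1/2) (xi_h - xi_bar).

    samples      : list of k lists of observations
    coefficients : any non-null a_1, ..., a_k (rescaled to unit length here)
    f_crit       : upper alpha point of F(k-1, n-k), assumed to be taken
                   from a table by the caller
    """
    k = len(samples)
    sizes = [len(group) for group in samples]
    n = sum(sizes)
    means = [sum(group) / len(group) for group in samples]
    grand_mean = sum(sum(group) for group in samples) / n
    # s^2 of (13.2.3): pooled within-sample variance on n - k d.f.
    within_ss = sum((value - mean) ** 2
                    for group, mean in zip(samples, means)
                    for value in group)
    s = sqrt(within_ss / (n - k))
    norm = sqrt(sum(a * a for a in coefficients))
    unit = [a / norm for a in coefficients]
    centre = sum(a * sqrt(size) * (mean - grand_mean)
                 for a, size, mean in zip(unit, sizes, means))
    half_width = s * sqrt((k - 1) * f_crit)
    return centre - half_width, centre + half_width
```

The half-width does not depend on the coefficients, which is what makes the statement simultaneous over all unit vectors of coefficients.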


(iv) For confidence bounds in the case of the general linear hypothesis we
proceed as follows from the set-up of chapter 12. Suppose we have $x_h$'s ($h = 1, 2,
\ldots, n$) which are $n$ independent $N(E(x_h), \sigma^2)$ such that, putting $x'(1\times n) = (x_1, x_2,
\ldots, x_n)$, we have

$$E(x)(n\times 1) = A(n\times m)\,\xi(m\times 1), \tag{13.2.9}$$

where $m < n$, $A$ is a matrix of rank, say $r \le m < n$, given by the experimental situa-
tion and $\xi(m\times 1)$ is a set of unknown parameters.

Putting $A'(m\times n) = \begin{bmatrix} A_1' \\ A_2' \end{bmatrix}$, where $A_1'$ has $r$ rows and $A_2'$ has $m-r$ rows, let us
assume, as we can without any loss
of generality, that $A_1'(r\times n)$ is a set of independent row vectors which might be
taken to be a basis of $A'(m\times n)$. Suppose now that it is required to test a
"testable" hypothesis

$$C(q\times m)\,\xi(m\times 1) = 0, \tag{13.2.10}$$

where $C$ is of rank $s \le \min(q, r) \le m < n$.

Putting

$$C(q\times m)\,\xi(m\times 1) = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}, \tag{13.2.11}$$

where the row blocks of $C$ have $s$ and $q-s$ rows and the column blocks have $r$ and $m-r$ columns,
assume, without any loss of generality, that $[C_{11} : C_{12}]$ can be taken as the basis of
$C$ and notice also from chapter 12 that for 'testability' we should have the further
condition

$$\begin{bmatrix} C_{12} \\ C_{22} \end{bmatrix} = \begin{bmatrix} C_{11} \\ C_{21} \end{bmatrix}[A_1'(r\times n)A_1(n\times r)]^{-1}A_1'(r\times n)A_2(n\times (m-r)). \tag{13.2.12}$$

We recall from (12.4.13) that the current $F$-test for (13.2.10) (at a level, say $\alpha$) has
a critical region given by

$$\frac{(n-r)\,x'A_1(A_1'A_1)^{-1}C_{11}'[C_{11}(A_1'A_1)^{-1}C_{11}']^{-1}C_{11}(A_1'A_1)^{-1}A_1'x}{s\,[x'x - x'A_1(A_1'A_1)^{-1}A_1'x]} \ge F_\alpha(s, n-r). \tag{13.2.13}$$

Recall that when (13.2.10) is true, the left hand side of (13.2.13) has the central $F$-
distribution with d.f. $s$ and $n-r$. Assume next that (13.2.10) is not true, but what
is true is
$C(q\times m)\,\xi(m\times 1) = \eta(q\times 1)$ ($\eta$ being given), or say

$$\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} = \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}, \tag{13.2.14}$$

where $\eta_1$ is $s\times 1$ and $\eta_2$ is $(q-s)\times 1$.
It is evident from the theory of linear equations that $\eta_2$ could not be just arbitrary
but that it must be related to $\eta_1$ through the same matrix prefactor through which
$C_{21}$ is related to $C_{11}$ and $C_{22}$ to $C_{12}$.
Then proceeding exactly as in chapter 12 we check that if in the left side of
(13.2.13) we replace $x(n\times 1)$ by $x(n\times 1) - B(n\times s)\,\eta_1(s\times 1)$, the resulting expression
is distributed as a central $F$ with d.f. $s$ and $n-r$, $B$ being given by

$$B(n\times s) = A_1(n\times r)\,[A_1'A_1]^{-1}(r\times r)\,C_{11}'(r\times s)\,[C_{11}(A_1'A_1)^{-1}C_{11}']^{-1}(s\times s). \tag{13.2.15}$$
If in (13.2.13) we now replace $x$ by $x - B\eta_1$ and also put

$$[C_{11}(A_1'A_1)^{-1}C_{11}']^{-1} = \tilde{U}(s\times s)\,\tilde{U}'(s\times s) \tag{13.2.16}$$

(notice that by (A.3.7) $\tilde{U}$ is determinate and also unique), then it is easy to see exactly
in the same way as in the previous case that the resulting statement $\Leftrightarrow$ the
following

$$x'A_1(A_1'A_1)^{-1}C_{11}'\tilde{U}a(s\times 1) - (EV)^{1/2}[sF_\alpha(s, n-r)]^{1/2} \le \eta_1'(1\times s)\,B'(s\times n)\,A_1(A_1'A_1)^{-1}C_{11}'\tilde{U}a \le x'A_1(A_1'A_1)^{-1}C_{11}'\tilde{U}a + (EV)^{1/2}[sF_\alpha(s, n-r)]^{1/2}, \tag{13.2.17}$$

for all $a$ subject to $a'(1\times s)\,a(s\times 1) = 1$, where $B$ is given by (13.2.15) and $\tilde{U}$ by (13.2.16)
and the error variance $EV$ by

$$EV = [x'x - x'A_1(A_1'A_1)^{-1}A_1'x]/(n-r). \tag{13.2.18}$$

Substituting for $B'$ from (13.2.15) we check that

$$B'A_1(A_1'A_1)^{-1}C_{11}' = I(s). \tag{13.2.19}$$

Also putting $\tilde{U}(s\times s)\,a(s\times 1) = b(s\times 1)$ and using (13.2.16) we note that

$$1 = a'a = b'(\tilde{U}')^{-1}\tilde{U}^{-1}b = b'(\tilde{U}\tilde{U}')^{-1}b = b'[C_{11}(A_1'A_1)^{-1}C_{11}']\,b. \tag{13.2.20}$$

The statement (13.2.17) thus reduces to the following:

$$x'A_1(A_1'A_1)^{-1}C_{11}'b - (EV)^{1/2}[sF_\alpha(s, n-r)]^{1/2} \le \eta_1'b \le x'A_1(A_1'A_1)^{-1}C_{11}'b + (EV)^{1/2}[sF_\alpha(s, n-r)]^{1/2} \tag{13.2.21}$$

for all $b$ subject to (13.2.20).
If we go back to (13.2.20) and reason as in (iii), it is easy to check that (13.2.21) implies

$$s^{1/2}s^* - (EV)^{1/2}[sF_\alpha(s, n-r)]^{1/2} \le \{\eta_1'[C_{11}(A_1'A_1)^{-1}C_{11}']^{-1}\eta_1\}^{1/2} \le s^{1/2}s^* + (EV)^{1/2}[sF_\alpha(s, n-r)]^{1/2}, \tag{13.2.21.1}$$

where $ss^{*2}$ is the "sum of squares due to the hypothesis", given by the numerator of
(12.4.13) with the factor $(n-r)$ taken out. (13.2.21.1) is thus a confidence statement
with a confidence coefficient $\ge 1-\alpha$.


This, therefore, is a set of simultaneous confidence bounds (with a joint
confidence coefficient $1-\alpha$) on all arbitrary linear functions of $\eta_1$. It is easy to see
that (13.2.21) subsumes as special cases, the confidence statements (13.2.1), (13.2.2)
and (13.2.7). Nevertheless, for expository purposes, it is worthwhile to discuss
separately the simpler cases first.


Two other particular examples of (13.2.17), of special practical interest are
also discussed, separately, in (v) and (vi).

(v) Suppose we have $y_h$'s ($h = 1, 2, \ldots, n$)
each being an $N(\theta_h, \sigma^2)$ such that $\mathrm{cov}(y_h, y_{h'}) = \rho\sigma^2$ ($h \ne h' = 1, 2, \ldots, n$), where
$\rho$ is known, but $\theta_h$ and $\sigma^2$ are unknown, but an independent estimate $s^2$ of $\sigma^2$ based
on $n'$ degrees of freedom is available. It is required to obtain a set of simultaneous
confidence bounds on the mean differences

$$\theta_h - \theta_{h'}, \ \text{ with } h, h' = 1, 2, \ldots, n, \ h \ne h'. \tag{13.2.22}$$

We have now a finite set of parametric functions. Let $z_h + \lambda\bar{\theta} = y_h + \lambda\bar{y}$,
where $\bar{y} = (y_1 + y_2 + \ldots + y_n)/n$, $\bar{\theta} = (\theta_1 + \theta_2 + \ldots + \theta_n)/n$ and the disposable constant
$\lambda$ is so adjusted that the $z_h$'s are uncorrelated. Then

$$E(z_h) = \theta_h, \quad \mathrm{var}(z_h) = \sigma^2(1-\rho), \ \text{ with } h = 1, 2, \ldots, n. \tag{13.2.23}$$

Let

$$w_{hh'} = \frac{|(z_h - \theta_h) - (z_{h'} - \theta_{h'})|}{s\sqrt{1-\rho}}, \ \text{ with } h, h' = 1, 2, \ldots, n, \ h \ne h'. \tag{13.2.24}$$

Then

$$w_{hh'} \le d \tag{13.2.25}$$

implies

$$y_h - y_{h'} - sd\sqrt{1-\rho} \le \theta_h - \theta_{h'} \le y_h - y_{h'} + sd\sqrt{1-\rho}. \tag{13.2.26}$$

Let $W_d$ be the intersection of the regions (13.2.25). Then clearly the necessary
and sufficient condition for the sample point to lie in $W_d$ is that

$$\frac{w}{s\sqrt{1-\rho}} \le d, \tag{13.2.27}$$

where

$$w = \sup\,|(z_h - \theta_h) - (z_{h'} - \theta_{h'})|, \ \text{ with } h, h' = 1, 2, \ldots, n, \ h \ne h'. \tag{13.2.28}$$

Thus if we set $d = q_\alpha(n, n')$, where $q_\alpha(n, n')$ is the upper $\alpha$-point of the distribution of
the studentized range with $n$, $n'$ degrees of freedom, that is the ratio of the range of
$n$ independent normal variates with zero mean to the square root of an independent
estimate of their common variance based on $n'$ degrees of freedom, then the required
simultaneous confidence intervals for the parametric functions (13.2.22) are

$$y_h - y_{h'} - s\,q_\alpha(n, n')\sqrt{1-\rho} \le \theta_h - \theta_{h'} \le y_h - y_{h'} + s\,q_\alpha(n, n')\sqrt{1-\rho}. \tag{13.2.29}$$

In particular $y_1, y_2, \ldots, y_n$ may be the means of $n$ random samples of equal size drawn
from normal populations with a common (unknown) variance, or may be the estimated
treatment effects in a randomized block or a balanced incomplete block experiment.
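
The two steps of (v) can be sketched in code. The sketch is illustrative and makes assumptions of its own: the explicit value of the disposable constant used below, namely -1 + [(1 - rho)/(1 + (n-1) rho)]^(1/2), is one root obtained by working out the condition that the z's be uncorrelated and is not quoted from the text; the critical value `q_crit` of the studentized range is assumed to be supplied from a table; and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def decorrelating_lambda(n, rho):
    """One value of the disposable constant making the z_h uncorrelated.

    Assumption: derived here from the equicorrelation structure, valid
    when 1 + (n - 1) * rho > 0.
    """
    return -1.0 + sqrt((1.0 - rho) / (1.0 + (n - 1) * rho))


def dispersion_of_z(n, rho):
    """Dispersion matrix of the z_h, divided by sigma^2.

    z = G y + constant with G = I + (lambda / n) J, J being the n x n
    matrix of ones, so the dispersion is G V G' with V equicorrelated.
    """
    c = decorrelating_lambda(n, rho) / n
    g = [[(1.0 if i == j else 0.0) + c for j in range(n)] for i in range(n)]
    v = [[1.0 if i == j else rho for j in range(n)] for i in range(n)]
    gv = [[sum(g[i][k] * v[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(gv[i][k] * g[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def difference_intervals(y, s, rho, q_crit):
    """Intervals (13.2.29) for all pairs h < h' (0-based indices).

    q_crit is the upper alpha point of the studentized range with n and
    n' degrees of freedom, assumed to be supplied by the caller.
    """
    half_width = s * q_crit * sqrt(1.0 - rho)
    return {(h, k): (y[h] - y[k] - half_width, y[h] - y[k] + half_width)
            for h, k in combinations(range(len(y)), 2)}
```

Under these assumptions `dispersion_of_z(n, rho)` comes out as (1 - rho) times the identity matrix, in agreement with (13.2.23), and every interval returned by `difference_intervals` has the same half-width.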
(vi) In factorial experiments we are usually interested in estimating linear functions
of treatment effects whose estimates are independently and normally distributed
with a common variance which can be independently estimated by an appropriate
multiple of the error mean square in the analysis of variance. The distribution needed
for simultaneous estimation in this case is slightly different from that occurring in (v).

Suppose, for example, that we have observations for a $2\times 2\times 2\times 2$ factorial
experiment with factors $A$, $B$, $C$, $D$, and that we are interested in simultaneously esti-
mating the main effects and two factor interactions only. We shall suppose that the
experiment is so laid out that none of these is confounded in any replication. Let
$t_{11}, t_{22}, t_{33}, t_{44}$ denote the true main effects and $t_{12}, t_{13}, t_{14}, t_{23}, t_{24}, t_{34}$ the true two factor
interactions. The order of the subscripts in $t_{ij}$ is immaterial, that is, $t_{ij} = t_{ji}$. We
can then write in the usual notation,

$$t_{11} = (1/8)(a-1)(b+1)(c+1)(d+1), \tag{13.2.30}$$

$$t_{12} = (1/8)(a-1)(b-1)(c+1)(d+1), \tag{13.2.31}$$

with similar expressions for other main effects and interactions. Let $y_{ij}$ be the esti-
mate of $t_{ij}$. Then reasoning as before we get the following simultaneous confidence
intervals for $t_{ij}$:

$$y_{ij} - s\,x_\alpha(n, n') \le t_{ij} \le y_{ij} + s\,x_\alpha(n, n'), \tag{13.2.32}$$

where $s^2$ is an estimate of $V(y_{ij})$, based on $n'$ degrees of freedom available for the esti-
mate of error, and where $n$, which is 10 in this particular example, is the number of
linear functions to be estimated.

The meaning of $x_\alpha(n, n')$ is as follows. Let $x_1, x_2, \ldots, x_n$ be independent nor-
mal variates with zero mean and variance $\sigma^2$. Let $|x|$ be the maximum of $|x_1|$,
$|x_2|, \ldots, |x_n|$ and let $s^2$ be an independent estimate of $\sigma^2$ based on $n'$ degrees of freedom.
Then $x_\alpha(n, n')$ is the upper $\alpha$-point of the distribution of $|x|/s$.

In a factorial experiment in which each factor is at more than two levels,
the above will still apply if the $n$ linear functions to be simultaneously estimated (or
tested for vanishing) are so chosen that their estimates are independently distributed
with a common variance.

13.3. Variances of one or two normal populations.

(i) Given a random sample of size $n+1$ (mean: $\bar{x}$ and s.d.: $s$) from an $N(\xi, \sigma^2)$,
we take over from (6.3.1) the following statement with probability $1-\alpha$:

$$\chi^2_{1\alpha_2}(n) \le ns^2/\sigma^2 \le \chi^2_{2\alpha_1}(n), \tag{13.3.1}$$

where $\chi^2_{2\alpha_1}(n)$ and $\chi^2_{1\alpha_2}(n)$ are the upper $\alpha_1$ and lower $\alpha_2$ points of the $\chi^2$-distribution with
d.f. $n$ and $\alpha$ is partitioned into $\alpha_1$ and $\alpha_2$ such that (a) $\alpha_1 + \alpha_2 = \alpha$ and (b) the comple-
ment of (13.3.1), i.e., the critical region is locally unbiassed (in the neighbourhood
of $\sigma$) in which case it has also been shown to have the monotonicity property. We
now rewrite (13.3.1) as

$$ns^2/\chi^2_{2\alpha_1}(n) \le \sigma^2 \le ns^2/\chi^2_{1\alpha_2}(n), \tag{13.3.2}$$

which gives confidence bounds on $\sigma^2$ with a confidence coefficient $1-\alpha$ and having
properties in terms of shortness similar to those possessed by (13.3.1) in terms of
the second kind of error, already discussed.
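
The passage from (13.3.1) to (13.3.2) is a plain inversion, sketched below under stated assumptions: the two percentage points of the chi-square distribution are taken as inputs (the partition of alpha that makes the region locally unbiassed is not computed here), the sample is of size n+1 so that the sum of squares about the mean is n s^2, and the function name is hypothetical.

```python
def variance_bounds(sample, chi2_lower_point, chi2_upper_point):
    """Confidence bounds (13.3.2) on sigma^2.

    sample            : n + 1 observations
    chi2_lower_point  : lower point of chi-square with n d.f. (caller supplied)
    chi2_upper_point  : upper point of chi-square with n d.f. (caller supplied)
    """
    size = len(sample)
    mean = sum(sample) / size
    # n s^2: sum of squares about the sample mean, carrying n = size - 1 d.f.
    sum_of_squares = sum((value - mean) ** 2 for value in sample)
    return sum_of_squares / chi2_upper_point, sum_of_squares / chi2_lower_point


# The two percentage points below are placeholders chosen for the
# arithmetic only; they are not values from a chi-square table.
lower_bound, upper_bound = variance_bounds([1.0, 2.0, 3.0, 4.0, 5.0], 0.5, 10.0)
```

The larger percentage point gives the lower bound on the variance and the smaller one the upper bound, as in (13.3.2).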

(ii) Given two random samples of sizes $n_h + 1$ (mean: $\bar{x}_h$ and s.d.: $s_h$) ($h =
1, 2$) from two $N(\xi_h, \sigma_h^2)$, we take over from (6.3.2) the following statement with
probability $1-\alpha$:

$$F_{1\alpha_2}(n_1, n_2) \le \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \le F_{2\alpha_1}(n_1, n_2), \tag{13.3.3}$$

where $F_{2\alpha_1}(n_1, n_2)$ and $F_{1\alpha_2}(n_1, n_2)$ are the upper $\alpha_1$ and lower $\alpha_2$ points of the $F$-distribution
with d.f. $n_1$ and $n_2$ and $\alpha$ is partitioned into $\alpha_1$ and $\alpha_2$ such that (a) $\alpha_1 + \alpha_2 = \alpha$ and (b)
the complement of (13.3.3), i.e., the critical region is locally unbiassed (in the
neighbourhood of $\sigma_1/\sigma_2$) in which case it has also been shown to have the mono-
tonicity property. We now rewrite (13.3.3) as

$$\frac{s_1^2/s_2^2}{F_{2\alpha_1}(n_1, n_2)} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2/s_2^2}{F_{1\alpha_2}(n_1, n_2)}, \tag{13.3.4}$$

which gives confidence bounds on $\sigma_1^2/\sigma_2^2$ with a confidence coefficient $1-\alpha$ and having
properties in terms of shortness similar to those possessed by (13.3.3) in terms of the
second kind of error, already discussed.


13.4. Coefficient of regression for a bivariate normal population. Let $x_1$ and
$x_2$ be distributed as a bivariate normal with variances $\sigma_1^2$ and $\sigma_2^2$ and correlation
coefficient $\rho$, and let the sample variances (on a sample of size $n+2$) be denoted by
$s_1^2$ and $s_2^2$, and the sample correlation coefficient by $r$. Also let $b_{12} = s_1r/s_2$ and
$\beta_{12} = \sigma_1\rho/\sigma_2$. It is easy to check that the variates $(x_1 - \beta_{12}x_2)$ and $x_2$ are uncorrelated,
so that when the population parameters are $\sigma_1$, $\sigma_2$ and $\rho$, $n^{1/2}r^*/(1-r^{*2})^{1/2}$ has the
$t$-distribution with $n$ d.f. Here $r^*$ stands for the sample correlation between
$(x_1 - \beta_{12}x_2)$ and $x_2$, that is,

$$r^* = \frac{s_1s_2r - \beta_{12}s_2^2}{(s_1^2 - 2\beta_{12}s_1s_2r + \beta_{12}^2s_2^2)^{1/2}s_2} = \frac{s_1r - \beta_{12}s_2}{[(s_1r - \beta_{12}s_2)^2 + s_1^2(1-r^2)]^{1/2}} = \frac{b_{12} - \beta_{12}}{[(b_{12} - \beta_{12})^2 + s_1^2(1-r^2)/s_2^2]^{1/2}}, \tag{13.4.1}$$

and, therefore,

$$\frac{r^*}{(1-r^{*2})^{1/2}} = \frac{(b_{12} - \beta_{12})\,s_2}{s_1(1-r^2)^{1/2}}. \tag{13.4.2}$$

Now consider the statement

$$-t_{\alpha/2}(n) \le n^{1/2}r^*/(1-r^{*2})^{1/2} \le t_{\alpha/2}(n), \tag{13.4.3}$$

where $t_{\alpha/2}(n)$ gives the upper $\alpha/2$-point of the $t$-distribution with $n$ d.f. This is easily
seen to reduce to the following confidence statement on $\beta_{12}$ (with a confidence
coefficient $1-\alpha$):

$$b_{12} - \frac{t_{\alpha/2}(n)\,s_1(1-r^2)^{1/2}}{n^{1/2}s_2} \le \beta_{12} \le b_{12} + \frac{t_{\alpha/2}(n)\,s_1(1-r^2)^{1/2}}{n^{1/2}s_2}. \tag{13.4.4}$$

By inversion of (13.4.4) the test that we obtain for the associated hypothesis $H_0 : \beta_{12}
= 0$, that is, $\rho = 0$, is easily checked to be the customary test based on '$r$' and hence
just the $t$-test. Similar procedures would go through for "partial regressions" or
"multiple regressions". The interesting point here is that it would be far more
difficult to give corresponding confidence bounds to $\rho$, because this would have to be
done by inverting the distribution of the noncentral $r$, which is quite complicated.
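
The algebra behind (13.4.1) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the sample moments are computed with the divisor (number of observations minus one), which is an assumption that does not affect the result since the correlation involved is unchanged by a common divisor. It compares the correlation between (x1 - beta x2) and x2 computed directly from the data with the closed form in terms of b_12, s_1, s_2 and r.

```python
from math import sqrt


def sample_covariance(u, v):
    """Sample covariance with divisor (number of observations - 1)."""
    size = len(u)
    mean_u, mean_v = sum(u) / size, sum(v) / size
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (size - 1)


def sample_correlation(u, v):
    return sample_covariance(u, v) / sqrt(sample_covariance(u, u)
                                          * sample_covariance(v, v))


def r_star_direct(x1, x2, beta):
    """Sample correlation between (x1 - beta * x2) and x2."""
    residual = [a - beta * b for a, b in zip(x1, x2)]
    return sample_correlation(residual, x2)


def r_star_closed_form(x1, x2, beta):
    """Last expression of (13.4.1)."""
    s1 = sqrt(sample_covariance(x1, x1))
    s2 = sqrt(sample_covariance(x2, x2))
    r = sample_correlation(x1, x2)
    b12 = s1 * r / s2
    return (b12 - beta) / sqrt((b12 - beta) ** 2 + s1 ** 2 * (1 - r ** 2) / s2 ** 2)
```

For any data and any trial value of beta the two functions agree up to rounding, so the statement (13.4.3) can indeed be expressed through b_12, s_1, s_2 and r alone.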


13.5. Difference in mean values between two variates having a bivariate
normal distribution. Let

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \ \text{ be } \ N\left(\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}, \begin{bmatrix} \sigma_1^2 & \sigma_1\sigma_2\rho \\ \sigma_1\sigma_2\rho & \sigma_2^2 \end{bmatrix}\right).$$

Then since $(x_1 - x_2)$ is $N[(\xi_1 - \xi_2), (\sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2\rho)]$, it is well known and easy to check
that in terms of a sample of size $n$ with sample means $\bar{x}_1$ and $\bar{x}_2$, sample variances
$s_1^2$ and $s_2^2$, and sample correlation coefficient $r$, we have the following confidence inter-
val for $\xi_1 - \xi_2$ (with a confidence coefficient $1-\alpha$)

$$\bar{x}_1 - \bar{x}_2 - s\,t_{\alpha/2}(n-1)/\sqrt{n} \le \xi_1 - \xi_2 \le \bar{x}_1 - \bar{x}_2 + s\,t_{\alpha/2}(n-1)/\sqrt{n}, \tag{13.5.1}$$

where $t_{\alpha/2}(n-1)$ is the upper $\alpha/2$ point of the ordinary $t$-distribution with d.f. $(n-1)$
and $s^2 = s_1^2 + s_2^2 - 2rs_1s_2$. Mathematically this is no doubt deducible from (13.2.1)
and is also a special case of (13.2.7) but this is so very important in practice that a
separate and explicit statement may not be out of place.
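
The remark that (13.5.1) is deducible from (13.2.1) rests on the fact that the quantity s^2 defined above is simply the sample variance of the differences. A minimal numerical check is sketched below; the function names are hypothetical and the divisor (n - 1) for sample variances is an assumption of the sketch.

```python
from math import sqrt


def sample_variance(values):
    """Sample variance with divisor (n - 1)."""
    size = len(values)
    mean = sum(values) / size
    return sum((v - mean) ** 2 for v in values) / (size - 1)


def sample_correlation(u, v):
    size = len(u)
    mean_u, mean_v = sum(u) / size, sum(v) / size
    cross = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (size - 1)
    return cross / sqrt(sample_variance(u) * sample_variance(v))


def variance_from_moments(x1, x2):
    """s^2 = s_1^2 + s_2^2 - 2 r s_1 s_2, as defined after (13.5.1)."""
    s1, s2 = sqrt(sample_variance(x1)), sqrt(sample_variance(x2))
    r = sample_correlation(x1, x2)
    return s1 ** 2 + s2 ** 2 - 2 * r * s1 * s2


def variance_of_differences(x1, x2):
    """Sample variance of the paired differences x1 - x2."""
    return sample_variance([a - b for a, b in zip(x1, x2)])
```

The two functions return the same value up to rounding, so the interval (13.5.1) is the interval (13.2.1) applied to the n paired differences.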


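13.6. Ratio of variances of two variates having a bivariate normal distribution.
Let

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \ \text{ be } \ N\left(\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}, \begin{bmatrix} \sigma_1^2 & \sigma_1\sigma_2\rho \\ \sigma_1\sigma_2\rho & \sigma_2^2 \end{bmatrix}\right).$$

Then for any constant $\lambda$, it is easy to check that covariance $(x_1 - \lambda x_2, x_1 + \lambda x_2) =
V(x_1) - \lambda^2V(x_2)$. Thus this will be $= 0$ if $\lambda^2 = V(x_1)/V(x_2) = \sigma_1^2/\sigma_2^2$. Thus, with a
positive $\lambda = \sigma_1/\sigma_2$, $(x_1 - \lambda x_2)$ and $(x_1 + \lambda x_2)$ will be uncorrelated and hence for a sample
of size $n$, with sample variances $s_1^2$ and $s_2^2$, and sample correlation coefficient $r$,
$\sqrt{n-2}\;r^*/(1-r^{*2})^{1/2}$ has the $t$-distribution with d.f. $n-2$ where

$$r^* = \text{sample correlation between } (x_1 - \lambda x_2) \text{ and } (x_1 + \lambda x_2)$$
$$= \frac{s_1^2 - \lambda^2s_2^2}{[(s_1^2 + \lambda^2s_2^2 + 2\lambda rs_1s_2)(s_1^2 + \lambda^2s_2^2 - 2\lambda rs_1s_2)]^{1/2}} = \frac{s_1^2 - \lambda^2s_2^2}{[s_1^4 + \lambda^4s_2^4 + 2\lambda^2s_1^2s_2^2(1 - 2r^2)]^{1/2}}. \tag{13.6.1}$$

Thus starting from the statement with probability $1-\alpha$

$$|\sqrt{n-2}\;r^*/(1-r^{*2})^{1/2}| \le t_{\alpha/2}(n-2), \tag{13.6.2}$$

and remembering that $\lambda = \sigma_1/\sigma_2$ and substituting (13.6.1) for $r^*$ in terms of $s_1^2$, $s_2^2$, $r$
we have for $\sigma_1^2/\sigma_2^2$ the following confidence bounds (with a confidence coefficient $1-\alpha$)

$$\frac{s_1^2}{s_2^2}\left[K - (K^2 - 1)^{1/2}\right] \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\left[K + (K^2 - 1)^{1/2}\right], \ \text{ where } \ K = 1 + \frac{2\,t_{\alpha/2}^2(n-2)\,(1-r^2)}{n-2}. \tag{13.6.3}$$

The cases discussed in 13.5 and 13.6 are relevant from a physical standpoint where
we have two comparable correlated variates, for example, the measurements on the
same characteristic of a set of individuals before and after the administration of a drug.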
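
The bounds (13.6.3) can be computed and checked against the statement (13.6.2) from which they are derived. The sketch below is a minimal illustration under stated assumptions: the upper point of the t-distribution, `t_crit`, is taken as an input to be read from a table, the function names are hypothetical, and the check uses the fact that at either bound the absolute value of the statistic in (13.6.2) should equal `t_crit` exactly.

```python
from math import sqrt


def variance_ratio_bounds(s1_squared, s2_squared, r, n, t_crit):
    """Bounds on sigma_1^2 / sigma_2^2 obtained by solving (13.6.2).

    t_crit is the upper alpha/2 point of t with n - 2 d.f. (caller supplied).
    """
    k = 1.0 + 2.0 * t_crit ** 2 * (1.0 - r ** 2) / (n - 2)
    root = sqrt(k * k - 1.0)
    ratio = s1_squared / s2_squared
    return ratio * (k - root), ratio * (k + root)


def absolute_t_statistic(lambda_squared, s1_squared, s2_squared, r, n):
    """|sqrt(n-2) r* / sqrt(1 - r*^2)| with r* taken from (13.6.1).

    Uses 1 - r*^2 = 4 lambda^2 s1^2 s2^2 (1 - r^2) / denominator^2.
    """
    numerator = abs(s1_squared - lambda_squared * s2_squared)
    denominator = 2.0 * sqrt(lambda_squared * s1_squared * s2_squared
                             * (1.0 - r * r))
    return sqrt(n - 2) * numerator / denominator
```

With, say, `variance_ratio_bounds(4.0, 1.0, 0.5, 12, 2.0)` (where 2.0 is a placeholder and not a tabled value), substituting either bound for lambda squared in `absolute_t_statistic` returns 2.0 up to rounding, and the product of the two bounds equals the square of the sample variance ratio.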




CHAPTER FOURTEEN 
Multivariate Confidence Bounds* 
14.1. A convenient notation. From now on we shall make use of a rather convenient notation. A random sample of size $n$ from a $p$-variate normal population, i.e. an $X(p\times n)$ having the p.d.f.

$$(2\pi)^{-pn/2}|\Sigma|^{-n/2}\exp\left[-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}(X-\xi)(X'-\xi')\right],$$

where $\xi(p\times n)$ stands for a $p\times n$ matrix each column of which is the same $p\times 1$ vector $\xi$ (with components $\xi_1,\ldots,\xi_p$), will be referred to as $X(p\times n)$: $N^*(\xi,\Sigma)$. A matrix $Y(p\times n)$ having the p.d.f.

$$(2\pi)^{-pn/2}|\Sigma|^{-n/2}\exp\left[-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}YY'\right]$$

will be referred to as $Y(p\times n)$: $N^*(0,\Sigma)$. We recall from (4.4)-(4.14) that starting with an $X(p\times(n+1))$: $N^*(\xi,\Sigma)$ and transforming and integrating we can always have a $Y(p\times n)$: $N^*(0,\Sigma)$, such that

$$nS(p\times p) = Y(p\times n)\,Y'(n\times p) = X(p\times(n+1))\,X'((n+1)\times p)-(n+1)\,\bar{x}(p\times 1)\,\bar{x}'(1\times p), \tag{14.1.1}$$

where $\bar{x}(p\times 1) = \frac{1}{n+1}X(p\times(n+1))\,\mathbf{1}((n+1)\times 1)$, $\mathbf{1}((n+1)\times 1)$ being an $(n+1)\times 1$ column vector with components $(1, 1, \ldots, 1)$.
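The algebraic identity behind (14.1.1), namely that the matrix of corrected sums of products equals $XX'$ minus $(n+1)\bar{x}\bar{x}'$, can be checked numerically. The sketch below assumes NumPy is available; the function name is hypothetical.

```python
import numpy as np

def corrected_sums_of_products(X):
    """Return nS computed two ways for a p x (n+1) data matrix X.

    First from deviations about the sample mean, then from the
    shortcut X X' - (n+1) x_bar x_bar'.
    """
    m = X.shape[1]                      # m = n + 1 observations (columns)
    x_bar = X.mean(axis=1, keepdims=True)
    from_deviations = (X - x_bar) @ (X - x_bar).T
    from_shortcut = X @ X.T - m * (x_bar @ x_bar.T)
    return from_deviations, from_shortcut
```

Dividing either result by $n$ (the number of columns minus one) gives the sample covariance matrix $S$ in the notation of this section.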


14.2. Confidence bounds relating to the mean vector for a multivariate normal distribution. Given an $X(p\times(n+1))$: $N^*(\xi,\Sigma)$, suppose we try to obtain simultaneous confidence bounds on arbitrary linear compounds of the population mean vector $\xi$. Consider the statement that

$$(n+1)^{1/2}\,|a'(\bar{x}-\xi)|/(a'Sa)^{1/2} \le c$$
$$\text{or}\quad (n+1)\,a'(\bar{x}-\xi)(\bar{x}'-\xi')a/a'Sa \le c^2, \tag{14.2.1}$$

where $\bar{x}$ is the sample mean vector and $S$ is the sample covariance matrix, and $a(p\times 1)$ is an arbitrary non-null nonstochastic column vector and $c$ is a given positive constant. The statement (14.2.1) stems from the customary Student's t-test and the associated confidence interval (both having well known optimum properties) relating to the parameter $a'\xi$. Now, for a given (positive) $c$ and given $X$, $\xi$, $S$ and of course $n$, the set of all statements (14.2.1) for all possible non-null vectors $a$ is exactly equivalent to the statement that

$$\sup_a\,(n+1)\,a'(\bar{x}-\xi)(\bar{x}'-\xi')a/a'Sa \le c^2. \tag{14.2.2}$$

It is well known that this "sup" comes out as $\mathrm{tr}\,(n+1)S^{-1}(\bar{x}-\xi)(\bar{x}'-\xi')$ or as $\mathrm{tr}\,(n+1)(\bar{x}'-\xi')S^{-1}(\bar{x}-\xi)$ (since tr $AB$ = tr $BA$), or simply as $(n+1)(\bar{x}'-\xi')S^{-1}(\bar{x}-\xi)$ (since tr scalar = scalar). It is also well known that under the null hypothesis, this is distributed as the central Hotelling's $T^2$ with d.f. $p$ and $n+1-p$ and that if in this statistic $\xi$ is replaced by $\xi^*(\ne\xi)$, the resulting statistic is distributed as the non-central Hotelling's $T^2$ with the same d.f. and with the non-centrality parameter $\tau^2 = (\xi^{*\prime}-\xi')\Sigma^{-1}(\xi^*-\xi)$. Going back to (14.2.1) it is thus easy to see that if

$$P\left[\sup_a \frac{(n+1)\,a'(\bar{x}-\xi)(\bar{x}'-\xi')a}{a'Sa} \ge c^2 \,\Big|\, \xi\right] = \alpha, \tag{14.2.3}$$

then $c^2 = T^2_\alpha$ is the upper $\alpha$-point of the central Hotelling's $T^2$-distribution with d.f. $p$ and $n+1-p$ and can be conveniently written as $T^2_\alpha(p, n+1-p)$. From (14.2.3) we have thus, with a confidence coefficient $1-\alpha$, the set of simultaneous or multiple confidence bounds (for all $\xi$ and all non-null $a$):

$$a'\bar{x}-[T^2_\alpha(a'Sa)/(n+1)]^{1/2} \le a'\xi \le a'\bar{x}+[T^2_\alpha(a'Sa)/(n+1)]^{1/2}. \tag{14.2.4}$$

* See references [44, 45, 46] in this connection.

It should be noted that (14.2.4) gives the simultaneous confidence bounds on all arbitrary linear compounds of the $p$ components of the population mean vector $\xi$. The shortness (in the sense of probability) of this set of confidence bounds, that is, the probability of these bounds covering $\xi^*$ when, in fact, $\xi^* \ne \xi$, is obviously

$$1-P[\text{non-central } T^2 \ge T^2_\alpha \mid \tau].$$

From the well known fact that the power function of Hotelling's $T^2$-test is a monotonically increasing function of the nonnegative $\tau$, it follows, therefore, that the shortness of the confidence bound (14.2.4) tends to zero as $\tau \to \infty$.
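The simultaneous bounds on $a'\xi$ can be sketched numerically as follows. The sketch assumes NumPy, assumes $p \ge 2$, and assumes that the critical value `T2_alpha` (the upper $\alpha$-point of the central $T^2$ with d.f. $p$ and $n+1-p$) is supplied by the caller; the function name is hypothetical.

```python
import numpy as np

def simultaneous_mean_bounds(X, a, T2_alpha):
    """Sketch of bounds of the form (14.2.4) on a'xi for one chosen a.

    X        : p x (n+1) data matrix (columns are observations), p >= 2.
    a        : non-null vector of length p.
    T2_alpha : upper alpha-point of Hotelling's T^2, supplied by the caller.
    """
    m = X.shape[1]                 # m = n + 1
    x_bar = X.mean(axis=1)
    S = np.cov(X)                  # divisor m - 1 = n, as in nS = YY'
    half_width = np.sqrt(T2_alpha * (a @ S @ a) / m)
    centre = a @ x_bar
    return centre - half_width, centre + half_width
```

Because the same `T2_alpha` is used for every `a`, the function may be called for any number of linear compounds without changing the joint confidence coefficient.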


Let us go back to (14.2.4) and choose $a'$ (of unit length) so as to maximize $a'\xi$. Then it is easy to see that (14.2.4) implies that $(\xi'\xi)^{1/2} \le (\bar{x}'\bar{x})^{1/2}+[T^2_\alpha/(n+1)]^{1/2}c^{1/2}_{\max}(S)$. A similar result follows for the other side of the inequality and thus (14.2.4) should imply

$$(\bar{x}'\bar{x})^{1/2}-[T^2_\alpha/(n+1)]^{1/2}c^{1/2}_{\max}(S) \le (\xi'\xi)^{1/2} \le (\bar{x}'\bar{x})^{1/2}+[T^2_\alpha/(n+1)]^{1/2}c^{1/2}_{\max}(S), \tag{14.2.5}$$

which, therefore, is a confidence statement with a confidence coefficient $\ge 1-\alpha$.

Back in (14.2.4), if we cut out the $i$-th element of $a$, the corresponding element of $\bar{x}$ and $\xi$, and the corresponding row and column of $S$ (with $i = 1, 2, \ldots, p$) and reason in the same manner as in the case of (14.2.5) we have $p$ truncated confidence statements of exactly the same form as (14.2.5). Likewise, cutting out any two elements, say the $i$-th and the $j$-th ($i \ne j = 1, 2, \ldots, p$), we have $\binom{p}{2}$ truncated confidence statements of this form, and so on. Thus altogether we have $2^p-1$ confidence statements, of which the leading one is (14.2.5), all with a joint confidence coefficient $\ge 1-\alpha$, which, in a sense, provides a complete analysis of the problem.


14.3. Confidence bounds relating to mean differences in k multivariate distributions. Given $X_h(p\times(n_h+1))$: $N^*(\xi_h,\Sigma)$ $(h = 1, 2, \ldots, k)$, let us try to obtain a set of simultaneous confidence bounds on all arbitrary double linear compounds of the $p$ components of the $k$ population mean vectors measured from the weighted grand mean vector. Consider now the statement

$$\left|\sum_{h=1}^{k} b_h\,a'(n_h+1)^{1/2}(\bar{x}_h-\bar{x}-\xi_h+\bar{\xi})\right| \le [(k-1)\,g^2\,a'Sa]^{1/2}, \tag{14.3.1}$$

where $\bar{x}_h$ is the mean vector for the $h$-th sample,

$$\bar{x} = \sum_{h=1}^{k}(n_h+1)\bar{x}_h\Big/\sum_{h=1}^{k}(n_h+1),\qquad \bar{\xi} = \sum_{h=1}^{k}(n_h+1)\xi_h\Big/\sum_{h=1}^{k}(n_h+1),$$

where $S$ is the pooled "within" covariance matrix of the $k$ samples, given by

$$\left(\sum_{h=1}^{k} n_h\right)S = \sum_{h=1}^{k} n_h S_h,$$

and $g$ is a given positive constant, $a(p\times 1)$ is an arbitrary non-null non-stochastic column vector and the $b_h$'s are arbitrary coefficients subject to $\sum_{h=1}^{k} b_h^2 = 1$.

If we now use the result that, for given $d_h$'s,

$$\sup_b\left|\sum_{h=1}^{k} b_h d_h\right| = \left(\sum_{h=1}^{k} d_h^2\right)^{1/2}\quad\text{subject to}\quad \sum_{h=1}^{k} b_h^2 = 1,$$

then it directly follows that, given all the other quantities including $a$, and under all possible variations of $b_h$'s subject to $\sum_{h=1}^{k} b_h^2 = 1$, the statement (14.3.1) is precisely equivalent to the statement that

$$\left[\sum_{h=1}^{k}\{a'(n_h+1)^{1/2}(\bar{x}_h-\bar{x}-\xi_h+\bar{\xi})\}^2\Big/(k-1)\,a'Sa\right]^{1/2} \le g,$$

or

$$\sum_{h=1}^{k} a'(n_h+1)(\bar{x}_h-\bar{x}-\xi_h+\bar{\xi})(\bar{x}_h'-\bar{x}'-\xi_h'+\bar{\xi}')a\Big/(k-1)\,a'Sa \le g^2. \tag{14.3.2}$$

Letting now $a$ vary and putting

$$(k-1)S^* = \sum_{h=1}^{k}(n_h+1)(\bar{x}_h-\bar{x}-\xi_h+\bar{\xi})(\bar{x}_h'-\bar{x}'-\xi_h'+\bar{\xi}'), \tag{14.3.3}$$

the statement (14.3.2), for all possible values of the non-null $a$, is precisely equivalent to:

$$\sup_a\,[a'S^*a/a'Sa] \le g^2. \tag{14.3.4}$$

As observed after (6.4.7), $S$ is, a.e., p.d. and $S^*$ is, a.e., at least p.s.d. of rank $q = \min(p, k-1)$ (p.s.d. if $p > k-1$ and p.d. if $p \le k-1$) and $\sup_a[a'S^*a/a'Sa]$ is just the largest root $c_q$ of the $p$-th degree determinantal equation in $c$: $|S^*-cS| = 0$. Of this equation all roots are nonnegative, $p-q$ of them always zero and $q$ are, a.e., positive. Thus (14.3.4), and hence (14.3.2) under all permissible variations of $a$ and the $b_h$'s, turns out to be equivalent to:

$$c_q \le g^2. \tag{14.3.5}$$
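The largest root of the determinantal equation $|S^*-cS| = 0$ is the largest characteristic root of $S^{-1}S^*$, which is how it would usually be computed. A minimal sketch, assuming NumPy and a positive definite $S$ (the function name is hypothetical):

```python
import numpy as np

def largest_root(S_star, S):
    """Largest root c of |S* - cS| = 0, for symmetric S* (at least p.s.d.) and p.d. S.

    The roots coincide with the characteristic roots of S^{-1} S*; they are
    real and nonnegative, so any tiny imaginary parts from rounding are dropped.
    """
    roots = np.linalg.eigvals(np.linalg.solve(S, S_star))
    return float(np.max(roots.real))
```

When $S^*$ has rank one, say $S^* = dd'$, the single positive root reduces to $d'S^{-1}d$, which is the form taken in the two-sample case.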


The distribution of $c_q$ on the null hypothesis is known and relatively easy and involves as parameters $p$, $k-1$, $\sum_{h=1}^{k} n_h$. Computation of the 5 per cent and 1 per cent points is in progress. Thus if

$$P[c_q \le c_\alpha \mid \text{null hypothesis}] = 1-\alpha, \tag{14.3.6}$$

we can write $c_\alpha = c_\alpha(p, k-1, \sum_{h=1}^{k} n_h)$, and now combining (14.3.1)-(14.3.6) we have, with a confidence coefficient $1-\alpha$, the following set of simultaneous confidence statements (for all $\xi_h$'s, all non-null $a$'s and all $b_h$'s subject to $\sum_{h=1}^{k} b_h^2 = 1$):

$$\sum_{h=1}^{k} b_h\,a'(n_h+1)^{1/2}(\bar{x}_h-\bar{x})-[(k-1)c_\alpha\,a'Sa]^{1/2} \le \sum_{h=1}^{k} b_h\,a'(n_h+1)^{1/2}(\xi_h-\bar{\xi}) \le \sum_{h=1}^{k} b_h\,a'(n_h+1)^{1/2}(\bar{x}_h-\bar{x})+[(k-1)c_\alpha\,a'Sa]^{1/2}, \tag{14.3.7}$$

where $c_\alpha = c_\alpha(p, k-1, \sum_{h=1}^{k} n_h)$.


This gives simultaneous confidence bounds on all arbitrary double linear compounds of the $p$ components of the difference between the $k$ population mean vectors $\xi_h$ and the weighted grand mean of these, which is $\bar{\xi}$. To discuss the shortness of (14.3.7) consider the non-central distribution of $c_q$, where $c_q$ is defined after (14.3.4), i.e., $c_q$ is the largest root of the equation in $c$:

$$|S^*-cS| = 0,\ \text{where } S^* \text{ is given by (14.3.3)}. \tag{14.3.8}$$

It is easy to see that the distribution of the non-central $c_q$ is really the distribution of $e_q$, where $e_q$ is the largest root of the equation in $e$ obtained by (i) replacing in (14.3.2) $\xi_h$ and $\bar{\xi}$ by $\xi_h^*(\ne\xi_h)$ and $\bar{\xi}^*(\ne\bar{\xi})$, (ii) substituting the resulting value of $S^*$ in (14.3.8) and (iii) assuming that the true population parameters are $\xi_h$ and $\bar{\xi}$. The distribution is extremely difficult but is well known (see section 7.6) to involve as parameters the positive roots $\gamma_1, \ldots, \gamma_s$ $(s \le \min(p, k-1))$ of the determinantal equation in $\gamma$: $|\Sigma^*-\gamma\Sigma| = 0$, where $\Sigma$ is the common covariance matrix of the $k$ populations and

$$(k-1)\Sigma^* = \sum_{h=1}^{k}(n_h+1)(\xi_h-\bar{\xi}-\xi_h^*+\bar{\xi}^*)(\xi_h'-\bar{\xi}'-\xi_h^{*\prime}+\bar{\xi}^{*\prime}).$$

This $\Sigma^*$ is necessarily at least p.s.d. of rank $\le \min(p, k-1) = s$ (say), so that out of the $p$ roots of the equation in $\gamma$, $p-s$ are zero and $s$ positive. Using (9.2.3) and referring to section (11.3) we observe that there is a good upper bound to the shortness of (14.3.7) and that the shortness is a monotonically decreasing function of the deviation parameters and tends to zero as these tend to infinity. With two populations (and samples), we have $q = \min(p, 1) = 1$, and thus only one positive sample root, say $c$, and at the most one positive population root, say $\gamma$. It is easy to check that in this case

$$c = \frac{(n_1+1)(n_2+1)}{n_1+n_2+2}\,\mathrm{tr}\,S^{-1}(\bar{x}_1-\bar{x}_2-\xi_1+\xi_2)(\bar{x}_1'-\bar{x}_2'-\xi_1'+\xi_2'),$$
$$\gamma = \frac{(n_1+1)(n_2+1)}{n_1+n_2+2}\,\mathrm{tr}\,\Sigma^{-1}(\xi_1-\xi_2-\xi_1^*+\xi_2^*)(\xi_1'-\xi_2'-\xi_1^{*\prime}+\xi_2^{*\prime}), \tag{14.3.9}$$

and it is well known that, on the null hypothesis, $c$ is distributed as central Hotelling's $T^2$ with d.f. $p$ and $n_1+n_2+1-p$, and on the alternative as non-central Hotelling's $T^2$ with the same d.f. and with a deviation parameter $\gamma$. It is also easy to check that in this case the confidence statement (14.3.7) reduces to

$$a'(\bar{x}_1-\bar{x}_2)-\left[\frac{n_1+n_2+2}{(n_1+1)(n_2+1)}\,T^2_\alpha\,a'Sa\right]^{1/2} \le a'(\xi_1-\xi_2) \le a'(\bar{x}_1-\bar{x}_2)+\left[\frac{n_1+n_2+2}{(n_1+1)(n_2+1)}\,T^2_\alpha\,a'Sa\right]^{1/2}, \tag{14.3.10}$$

where $T^2_\alpha = T^2_\alpha(p, n_1+n_2+1-p)$ is the upper $\alpha$-point of Hotelling's $T^2$. The shortness of (14.3.10) is exactly known and of course tends to zero as $\gamma \to \infty$.
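For the two-sample case the bounds on $a'(\xi_1-\xi_2)$ can be sketched as below. The sketch assumes NumPy, $p \ge 2$, samples of sizes $n_1+1$ and $n_2+1$ stored column-wise, and a critical value `T2_alpha` (upper $\alpha$-point of $T^2$ with d.f. $p$ and $n_1+n_2+1-p$) supplied by the caller; the function name is hypothetical.

```python
import numpy as np

def two_sample_bounds(X1, X2, a, T2_alpha):
    """Sketch of bounds of the form (14.3.10) on a'(xi_1 - xi_2).

    X1, X2   : p x (n_h + 1) data matrices, columns are observations.
    a        : non-null vector of length p.
    T2_alpha : upper alpha-point of Hotelling's T^2, supplied by the caller.
    """
    m1, m2 = X1.shape[1], X2.shape[1]          # m_h = n_h + 1
    n1, n2 = m1 - 1, m2 - 1
    # pooled "within" covariance matrix: (n1 + n2) S = n1 S1 + n2 S2
    S = (n1 * np.cov(X1) + n2 * np.cov(X2)) / (n1 + n2)
    centre = a @ (X1.mean(axis=1) - X2.mean(axis=1))
    half_width = np.sqrt((m1 + m2) / (m1 * m2) * T2_alpha * (a @ S @ a))
    return centre - half_width, centre + half_width
```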
14.4. An important subset of the set of bounds (14.3.7). Suppose now that, instead of all contrasts of the type $\sum_{h=1}^{k} b_h\,a'(n_h+1)^{1/2}(\xi_h-\bar{\xi})$ (with given restrictions on $a$ and the $b$'s), we are interested in contrasts of the type $a'(\xi_h-\xi_l)$, for all non-null $a'$ and all $h \ne l = 1, 2, \ldots, k$. It is easy to offer a multiple set of confidence bounds for contrasts of this type, which can be regarded as one kind of multivariate (under unequal sample sizes) analogue of a somewhat similar set given by Tukey for the corresponding univariate situations, and discussed in Section (13.2). The proposed set is built up as follows. With the same notation as before, and with $n_{hl} = (n_h+1)(n_l+1)/(n_h+n_l+2)$, note that

$$T^2_{hl} = n_{hl}\,(\bar{x}_h'-\bar{x}_l'-\xi_h'+\xi_l')\,S^{-1}(\bar{x}_h-\bar{x}_l-\xi_h+\xi_l) = n_{hl}\,\sup_a\, a'(\bar{x}_h-\bar{x}_l-\xi_h+\xi_l)(\bar{x}_h'-\bar{x}_l'-\xi_h'+\xi_l')a/a'Sa.$$

Thus, for a given pair $(h, l)$, the statement that $T^2_{hl} \le T^2_\alpha$ is exactly equivalent to the statement that, for all non-null $a$'s,

$$a'(\bar{x}_h-\bar{x}_l)-[T^2_\alpha\,a'Sa/n_{hl}]^{1/2} \le a'(\xi_h-\xi_l) \le a'(\bar{x}_h-\bar{x}_l)+[T^2_\alpha\,a'Sa/n_{hl}]^{1/2}.$$

We observe that when the true population means are the $\xi_h$'s, $T^2_{hl}$ is distributed as Hotelling's $T^2$ with d.f. $p$ and $\sum_{h=1}^{k} n_h+1-p$.
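The pairwise statistics can be sketched numerically as follows, evaluated at $\xi_h = \xi_l$ (that is, as they would be used in the associated test). The sketch assumes NumPy, $p \ge 2$ and column-wise samples; the function name is hypothetical, and no critical value is computed.

```python
from itertools import combinations

import numpy as np

def pairwise_T2(samples):
    """T^2_{hl} for every pair (h, l), with xi_h - xi_l set to zero.

    samples : list of p x (n_h + 1) data matrices, columns are observations.
    Returns a dict mapping (h, l) with h < l to the statistic.
    """
    dof = [X.shape[1] - 1 for X in samples]                    # the n_h
    # pooled "within" covariance matrix of all k samples
    S = sum(n * np.cov(X) for n, X in zip(dof, samples)) / sum(dof)
    means = [X.mean(axis=1) for X in samples]
    out = {}
    for h, l in combinations(range(len(samples)), 2):
        mh, ml = dof[h] + 1, dof[l] + 1
        n_hl = mh * ml / (mh + ml)
        d = means[h] - means[l]
        out[(h, l)] = float(n_hl * (d @ np.linalg.solve(S, d)))
    return out
```

The largest value in the returned dictionary is the quantity whose null distribution would be needed to fix $T^2_\alpha$ for the whole set of pairs.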


Now, considering all pairs $(h, l)$ out of $k$ samples (and $k$ populations), it is easy to see that the statement: all $T^2_{hl}$'s $\le T^2_\alpha$, is precisely equivalent to the statement that the largest $T^2_{hl}$ out of all pairs is $\le T^2_\alpha$, which again is equivalent to the statement that, for all non-null $a$'s and all pairs $(h, l)$ out of $k$,

$$a'(\bar{x}_h-\bar{x}_l)-[T^2_\alpha\,a'Sa/n_{hl}]^{1/2} \le a'(\xi_h-\xi_l) \le a'(\bar{x}_h-\bar{x}_l)+[T^2_\alpha\,a'Sa/n_{hl}]^{1/2}. \tag{14.4.1}$$

If the confidence coefficient of (14.4.1) is to be $1-\alpha$, then $T^2_\alpha = T^2_\alpha(p, n_1, n_2, \ldots, n_k)$ will be given by

$$P\left[\text{largest } T^2_{hl} \text{ out of } \tbinom{k}{2} \text{ pairs} \ge T^2_\alpha \,\Big|\, \text{null hypothesis}\right] = \alpha. \tag{14.4.2}$$

It is obvious that the distribution of the largest $T^2_{hl}$ involves as parameters just $p$ and $n_1, n_2, \ldots, n_k$. It is easy to see that the distribution is manageable only when the number of parameters is small. In particular, the case that $n_1 = n_2 = \ldots = n_k$ and $p = 1$ is identical with the one considered in Section 13.2. It may also be noted that when $k = 2$, the largest $T^2_{hl}$ will of course be Hotelling's $T^2$ distributed with d.f. $p$ and $n_1+n_2+1-p$. Also the shortness of the confidence bounds (14.4.1) can be formally written as

$$P\left[\text{largest } T^2_{hl} \text{ out of } \tbinom{k}{2} \text{ pairs} \le T^2_\alpha(p, n_1, n_2, \ldots, n_k) \,\Big|\, \text{alternative}\right].$$

It is important to observe that while each $T^2_{hl}$ is individually distributed (on the null hypothesis) as a central Hotelling's $T^2$ with d.f. $p$ and $\sum_{h=1}^{k} n_h+1-p$, the $\binom{k}{2}$ $T^2_{hl}$'s are not independent, nor do we know what the distribution of the largest central $T^2_{hl}$ is, to say nothing of the non-central case, so that the confidence statement (14.4.1) has not been reduced to practical terms as was done for the other cases discussed in this section. The distribution problem arising in this situation is now under investigation.

For the associated problem of testing $H_0$: $\xi_1 = \ldots = \xi_k$, we set up as before the rule that if, for all non-null $a$ and all pairs $(h, l)$, the bounds (14.4.1) include zero, we accept $H_0$ and reject it otherwise. The properties (including power) of this test are tied up in an obvious manner with those of the multiple confidence interval statement (14.4.1).

Notice that so far, in testing of hypotheses by inversion of confidence statements, we have considered two-decision problems. Suppose, at this point, for purposes of illustration, we offer a multi-decision procedure, namely that, for a given pair $(h, l)$, we accept or reject $H(\xi_h = \xi_l)$ according as all those bounds (14.4.1) which involve $\bar{x}_h$ and $\bar{x}_l$ only include or exclude zero. It is obvious that in all other situations considered so far we could set up similar multi-decision procedures.

14.5. Further observations. In many situations it might be of greater physical interest to be able to make, instead of (14.3.7) or (14.4.1), a set of just $p\times\binom{k}{2}$ confidence interval statements, each relating to just one variate and the difference between one of $\binom{k}{2}$ pairs. In other words, if $\xi_h' = (\xi_{1h}, \xi_{2h}, \ldots, \xi_{ph})$ $(h = 1, 2, \ldots, k)$ denote the $p$ means for the $h$-th population, then we would like to make a statement of the form

$$f_{1jhh'}(X_1, X_2, \ldots, X_k) \le \xi_{jh}-\xi_{jh'} \le f_{2jhh'}(X_1, X_2, \ldots, X_k) \tag{14.5.1}$$

(with obvious applications to subsection 13.2), for all $h \ne h' = 1, 2, \ldots, k$ and all $j = 1, 2, \ldots, p$, where $f_{1jhh'}$ and $f_{2jhh'}$ are supposed to be two different functions of the whole set of $p\times\sum_{h=1}^{k}(n_h+1)$ raw observations. It is clear that (14.5.1) is a subset of (14.4.1), which again is a subset of (14.3.7). Whether it is possible to make a statement like (14.5.1) in an elegant and useful way (i.e., with manageable functions $f_{1jhh'}$ and $f_{2jhh'}$) and with a given joint confidence coefficient $1-\alpha$, that is, free of the nuisance parameters $\Sigma$, is still an open question. It may well be that a range (not too wide) for the confidence coefficient itself is called for. Furthermore, whatever set of confidence intervals like (14.5.1) we propose, be it under a fixed confidence coefficient or under a confidence coefficient lying in a short range, the "goodness" of such a set would pose further questions. It is believed that in this situation a more promising approach might be one involving a suitable two-stage procedure.

14.6. Confidence bounds connected with a general linear hypothesis. In place of the set-up of Section (14.3) consider the more general set-up of (iiic) of Chapter 5, which is the following. We have an $X(p\times n)$ whose column vectors are independently distributed, the $r$-th vector $x_r$ being $N(E(x_r), \Sigma)$ $(r = 1, 2, \ldots, n)$. It is also given that $E(X'(n\times p)) = A(n\times m)\,\xi(m\times p)$, where $\xi$ is a set of unknown parameters and $A(n\times m)$ is given by the experimental situation such that it is of rank $r \le m < n$. Also setting

$$A'(m\times n) = \begin{bmatrix} A_1' \\ A_2' \end{bmatrix}\begin{matrix} r \\ m-r \end{matrix} \tag{14.6.1}$$

(each block having $n$ columns), let $A_1$ be a basis of $A'$, i.e. of $A$. Next consider a matrix $C(q\times m)$ of rank $s \le \min(q, r) \le m < n$ with a structure given by

$$C(q\times m) = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}\begin{matrix} s \\ q-s \end{matrix} \tag{14.6.2}$$

(the column blocks having $r$ and $m-r$ columns), such that $[C_{11}\ C_{12}]$ forms a basis and also that $C$ satisfies (13.2.12). Then combining the results (13.2.9)-(13.2.21) with the results (14.3.1)-(14.3.7) it is easy to check that the following is a set of simultaneous confidence bounds (with a confidence coefficient $1-\alpha$),

$$a'(1\times p)X(p\times n)A_1(n\times r)(A_1'A_1)^{-1}(r\times r)\,C_{11}'(r\times s)\,b(s\times 1)-(a'Sa)^{1/2}[s\,c_\alpha(p, s, n-r)]^{1/2} \le a'(1\times p)\,\eta'(p\times s)\,b(s\times 1) \le a'XA_1(A_1'A_1)^{-1}C_{11}'b+(a'Sa)^{1/2}[s\,c_\alpha(p, s, n-r)]^{1/2}, \tag{14.6.3}$$

for all non-null $a'(1\times p)$ and all $b(s\times 1)$ subject to

$$b'[C_{11}(A_1'A_1)^{-1}C_{11}']\,b = 1, \tag{14.6.4}$$

where $\eta(s\times p)$ is given by

$$[C_{11}\ C_{12}]\,\xi(m\times p) = \eta(s\times p), \tag{14.6.5}$$

and $c_\alpha(p, s, n-r)$ is the upper $\alpha$ point of the distribution of the largest root of (6.4.7), under (12.7.3) with d.f. $(p, s, n-r)$.

The confidence bounds (14.6.3) are thus seen to be really on arbitrary double linear compounds of $[C_{11}\ C_{12}]\,\xi$.


If we go back to (14.3.1)-(14.3.7) again, and normalize $b(s\times 1)$ into $U(s\times s)\,b^*(s\times 1)$ where $b^*$ is a unit vector and $UU' = [C_{11}(A_1'A_1)^{-1}C_{11}']^{-1}$, then it is easy to see that the statement (14.6.3) will imply

$$c^{1/2}_{\max}(sS^*)-[s\,c_\alpha(p, s, n-r)]^{1/2}c^{1/2}_{\max}(S) \le c^{1/2}_{\max}[\eta'(C_{11}(A_1'A_1)^{-1}C_{11}')^{-1}\eta] \le c^{1/2}_{\max}(sS^*)+[s\,c_\alpha(p, s, n-r)]^{1/2}c^{1/2}_{\max}(S), \tag{14.6.6}$$

where $S^*$ and $S$ are the dispersion matrices due to the "hypothesis" and due to the "error" and are given respectively by (12.7.10) and (12.7.11). (14.6.6) is thus a confidence statement with a confidence coefficient $\ge 1-\alpha$.


Also harking back to the remarks made after (12.7.11.5) we notice that if $C\xi = 0$ were replaced by $C\xi M = C\xi^* = 0$ and $[C_{11}\ C_{12}]\,\xi = \eta$ by $[C_{11}\ C_{12}]\,\xi M = [C_{11}\ C_{12}]\,\xi^* = \eta^*$, then (14.6.3) would be replaced by a statement in which everything else would stay the same except that under $c_\alpha$, $p$ would be replaced by $u$, $X$ would be replaced by $M'(u\times p)X(p\times n)$, $S$ would be replaced by $M'(u\times p)S(p\times p)M(p\times u)$ and all non-null $a'(1\times p)$ would be replaced by all non-null $a^{*\prime}(1\times u)$. Similarly, in (14.6.6), in addition, $S^*$ would be replaced by $M'(u\times p)S^*(p\times p)M(p\times u)$ and $\eta(s\times p)$ would be replaced by $\eta^*(s\times u) = \eta(s\times p)M(p\times u)$, and similarly for $\eta'$.

With a confidence coefficient $\ge 1-\alpha$, (14.6.6) will now be replaced by the confidence statement

$$c^{1/2}_{\max}(sM'S^*M)-[s\,c_\alpha(u, s, n-r)]^{1/2}c^{1/2}_{\max}(M'SM) \le c^{1/2}_{\max}[\eta^{*\prime}(C_{11}(A_1'A_1)^{-1}C_{11}')^{-1}\eta^*] \le c^{1/2}_{\max}(sM'S^*M)+[s\,c_\alpha(u, s, n-r)]^{1/2}c^{1/2}_{\max}(M'SM). \tag{14.6.7}$$


This follows from a modified form of (14.6.3) obtained by replacing $a(p\times 1)$ of (14.6.3) by $a^*(u\times 1)$ and introducing the other modifications just mentioned. If now we cut out the $i$-th element of $a^*$ and the corresponding row of $M'$ and $\eta^{*\prime}$ and reason in the same manner we should have (for $i = 1, 2, \ldots, u$) $u$ truncated confidence statements like (14.6.7). Likewise cutting out any $j$-th element of $b$ and the corresponding row of $C_{11}$ and column of $\eta^{*\prime}$ and reasoning in the same manner as before we should have (for $j = 1, 2, \ldots, s$) $s$ truncated confidence statements like (14.6.7). Next, cutting out $i, i'$ ($i \ne i' = 1, 2, \ldots, u$) we have $\binom{u}{2}$ statements like (14.6.7), and cutting out $j, j'$ ($j \ne j' = 1, 2, \ldots, s$) we have $\binom{s}{2}$ statements like (14.6.7), and so on. Thus we have altogether $2^u-1$ statements (based on truncation on $u$) and $2^s-1$ statements (based on truncation on $s$) and, by combination, $(2^u-1)\times(2^s-1)$ statements of which the leading one is (14.6.7), all with a joint confidence coefficient $\ge 1-\alpha$, which, in a sense, provides a complete analysis of the problem.

14.7. Confidence bounds on departures from a particular kind of multicollinearity of means. With $k$ $(p+q)$-variate $N(\xi_i, \Sigma)$ (with $k > p+q$), where $\Sigma((p+q)\times(p+q))$ is symmetric p.d. with submatrices $\Sigma_{11}(p\times p)$, $\Sigma_{22}(q\times q)$ and $\Sigma_{12}(p\times q)$ and $\xi_i((p+q)\times 1)$ has column subvectors $\xi_{1i}(p\times 1)$ and $\xi_{2i}(q\times 1)$, let us consider the hypothesis $H_0$:
that X,, Уу can be 


set of $q$ variates. The hypothesis $H_0$ can thus be stated otherwise as the hypothesis that the matrix of means of the first $p$ variates, viz., $(\xi_{11}\ \xi_{12}\ \ldots\ \xi_{1k})$, is equal to the matrix of means of the remaining $q$ variates, premultiplied by the regression matrix of the $p$ variates on the $q$ variates. We are now interested in setting confidence bounds on

$$\xi_{1i}-\Sigma_{12}\Sigma_{22}^{-1}\xi_{2i}\quad (i = 1, 2, \ldots, k), \tag{14.7.1}$$

which, naturally, are departures from $H_0$. More properly speaking, we shall be interested in setting simultaneous confidence bounds on arbitrary bilinear compounds $a'(1\times p)\,\beta(p\times k)\,b(k\times 1)$, where $\beta$ is a $(p\times k)$ matrix with column vectors given by (14.7.1).

Now taking the 'residuals' of the first $p$ variates with respect to the remaining $q$ variates after the manner of (A.3.17) it is easy to check that for the $i$-th population the residuals will be distributed as a $p$-variate normal with a covariance matrix $\Sigma_{1.2} = \Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'$ and about the mean vector $\xi_{1i}-\Sigma_{12}\Sigma_{22}^{-1}\xi_{2i}$ (with $i = 1, 2, \ldots, k$). Also the 'within' covariance matrix of the 'residuals', pooled from $k$ samples of size, say, $n$ each, will be given by

$$S_{1.2} = S_{11}-S_{12}S_{22}^{-1}S_{12}', \tag{14.7.2}$$

where $S_{11}(p\times p)$, $S_{22}(q\times q)$ and $S_{12}(p\times q)$ stand for the submatrices of the 'within' covariance matrix (of the $p+q$ variates) pooled from the $k$ samples. The mean vector for the $i$-th sample will be given by

$$\bar{x}_{1i}-S_{12}S_{22}^{-1}\bar{x}_{2i},\quad\text{with } i = 1, 2, \ldots, k. \tag{14.7.3}$$

Let $B(p\times k)$ stand for the $(p\times k)$ matrix with the $k$ column vectors given by (14.7.3).


Thus exactly as in section (14.6) we have with a confidence coefficient, say $1-\alpha$, the following set of simultaneous confidence bounds (for all arbitrary non-null $a'(1\times p)$ and unit length $b(k\times 1)$):

$$a'Bb-[k(a'S_{1.2}a)\,c_\alpha(p, k, nk-k)]^{1/2} \le a'\beta b \le a'Bb+[k(a'S_{1.2}a)\,c_\alpha(p, k, nk-k)]^{1/2}, \tag{14.7.4}$$

where $c_\alpha(p, k, nk-k)$ is the upper $\alpha$-point of the distribution of the (central) largest characteristic root based on $p$, $k$ and $nk-k$ degrees of freedom. The test for the associated hypothesis $H_0$ is also easily obtained, the critical region being given by

$$c_p \ge c_\alpha(p, k, nk-k), \tag{14.7.5}$$

where $c_p$ is the largest root of $\frac{1}{k}(BB')S_{1.2}^{-1}$. Notice that $\beta$ and $B$ are each a $(p\times k)$ matrix with $k$ column vectors given respectively by (14.7.1) and (14.7.3).


14.8. Confidence bounds on departures from another kind of multicollinearity of means. It seems that when the population covariance matrix $\Sigma$ is not supposed to be known there are two kinds of multicollinearity of means (and departures from it) which can be properly handled, namely, (i) that the matrix of means of the first $p$ variates is a constant matrix times the matrix of means of the remaining $q$ variates, the constant matrix factor being equal to the regression matrix of the $p$-set on the $q$-set, whatever this regression matrix might be, and (ii) that the matrix of means of the first $p$ variates is a constant (and given) matrix times the matrix of means of the remaining $q$ variates. Case (i) is the one discussed in section 14.7 while case (ii) belongs to linear hypothesis in multivariate analysis of variance of means and has already been discussed in section 14.6.


14.9. Confidence bounds connected with the dispersion matrix of a multivariate normal distribution. Let us start from a $Y(p\times n)$: $N^*(0, \Sigma)$ where $\Sigma(p\times p)$ is supposed to be p.d. (so that its characteristic roots are all positive). For simplicity we also assume that $p \le n$, so that, a.e., $YY'$, that is, $nS$ is p.d., and hence all its characteristic roots are positive. We now recall the well known result (A.3.3) that there exists an orthogonal $\Gamma(p\times p)$ such that $\Sigma(p\times p) = \Gamma(p\times p)\,D_\gamma(p\times p)\,\Gamma'(p\times p)$ where the $\gamma$'s are the characteristic roots of $\Sigma$. If the roots are distinct then by a convention, say by taking all the elements of the first row of $\Gamma$ to be positive, the transformation could be made one-to-one. However, we do not need this for our present purpose. Note that the number of independent elements on both sides is the same. Except for the factor $(-\frac{1}{2})$ the argument under the exponential in the probability density of $Y$ can now be written, if we put $\lambda = \gamma^{-1}$, as

$$\mathrm{tr}\,(\Gamma D_\lambda\Gamma')YY' = \mathrm{tr}\,\Gamma D_{\sqrt{\lambda}}D_{\sqrt{\lambda}}\Gamma'YY' = \mathrm{tr}\,(D_{\sqrt{\lambda}}\Gamma'Y)(D_{\sqrt{\lambda}}\Gamma'Y)'.$$

If we put $Z = D_{\sqrt{\lambda}}\Gamma'Y$, it is easy to check that the probability density of $Z$ is

$$(2\pi)^{-pn/2}\exp\left[-\tfrac{1}{2}\,\mathrm{tr}\,ZZ'\right]. \tag{14.9.1}$$

For all non-null nonstochastic $a(p\times 1)$ consider now the simultaneous statement that

$$g_1^2 \le a'ZZ'a/a'a \le g_2^2\quad\text{or}\quad g_1^2 \le a'(D_{\sqrt{\lambda}}\Gamma'YY'\Gamma D_{\sqrt{\lambda}})a/a'a \le g_2^2. \tag{14.9.2}$$

This statement, for a given $Z$ and $g_1^2$ and $g_2^2$, is precisely equivalent to the statement that

$$g_1^2 \le \inf_a\frac{a'ZZ'a}{a'a} \le \sup_a\frac{a'ZZ'a}{a'a} \le g_2^2\quad\text{or that}\quad g_1^2 \le c_1 \le c_p \le g_2^2, \tag{14.9.3}$$

where $c_1$ and $c_p$ are the smallest and largest characteristic roots of the matrix $ZZ'$, both, a.e., positive. The relevant distribution on the null hypothesis, i.e., when the true population matrix is $\Sigma$, is known and we now put

$$g_1^2 = c_{1\alpha}(p, n)\quad\text{and}\quad g_2^2 = c_{2\alpha}(p, n), \tag{14.9.4}$$

where $c_{1\alpha}(p, n)$ and $c_{2\alpha}(p, n)$ are constants taken over from (6.4.3). If we now tie up (14.9.2), (14.9.3) and (14.9.4) we have, with a confidence coefficient $1-\alpha$, the set of multiple or simultaneous confidence interval statements for all non-null $a$ and all permissible values of the unknown parameters $\Gamma$ and $\lambda$:

$$a'a\,c_{1\alpha}(p, n) \le a'(D_{\sqrt{\lambda}}\Gamma'YY'\Gamma D_{\sqrt{\lambda}})a \le a'a\,c_{2\alpha}(p, n), \tag{14.9.5}$$

or, remembering that $nS = YY'$,

$$a'a\,c_{1\alpha}(p, n) \le a'(D_{\sqrt{\lambda}}\Gamma'\,nS\,\Gamma D_{\sqrt{\lambda}})a \le a'a\,c_{2\alpha}(p, n).$$

The shortness of the confidence bounds (14.9.5) is tied up with the power of the test (6.4.3), which has been already discussed in Chapters 9-11.


Far more meaningful confidence bounds than (14.9.5) can be obtained in the following way, starting from (14.9.5). As before denoting the characteristic roots of a (square) matrix $M$ by $c(M)$ and the largest and the smallest roots of $M$ (if $M$ is at least p.s.d.) by $c_{\max}(M)$ and $c_{\min}(M)$, and remembering that $\lambda = \gamma^{-1}$ and finally using (A.2.5) we can rewrite (14.9.5) as

$$n^{-1}c_{1\alpha}(p, n) \le \text{all } c(D_{1/\sqrt{\gamma}}\Gamma'S\Gamma D_{1/\sqrt{\gamma}}) \le n^{-1}c_{2\alpha}(p, n). \tag{14.9.6}$$

Now using (A.1.18) we note that

$$c(D_{1/\sqrt{\gamma}}\Gamma'S\Gamma D_{1/\sqrt{\gamma}}) = c(S\Gamma D_{1/\gamma}\Gamma') = c(S\Sigma^{-1}), \tag{14.9.7}$$

and obtain with a confidence coefficient $1-\alpha$, the confidence bounds

$$n^{-1}c_{1\alpha}(p, n) \le \text{all } c(S\Sigma^{-1}) \le n^{-1}c_{2\alpha}(p, n),\quad\text{or}\quad n\,c_{1\alpha}^{-1}(p, n) \ge \text{all } c(\Sigma S^{-1}) \ge n\,c_{2\alpha}^{-1}(p, n). \tag{14.9.8}$$

We now recall (A.1.25) and deduce from it that

$$c_{\min}(B^{-1})\,c_{\min}(AB) \le \text{all } c(A) \le c_{\max}(B^{-1})\,c_{\max}(AB). \tag{14.9.9}$$

By using (14.9.9) it is easy to see that the statement (14.9.8) $\Rightarrow$ the following:

$$n\,c_{1\alpha}^{-1}(p, n)\,c_{\max}(S) \ge \text{all } c(\Sigma) \ge n\,c_{2\alpha}^{-1}(p, n)\,c_{\min}(S). \tag{14.9.10}$$

We now use the following result of set-theoretic logic, namely that

$$\text{"If } E_1 \text{, then } E_2\text{"} \Leftrightarrow \text{"}E_2 \text{ is a necessary condition for } E_1\text{"} \Leftrightarrow E_1 \subset E_2 \Rightarrow P(E_1) \le P(E_2), \tag{14.9.11}$$

to observe that if the probability of (14.9.8) is $1-\alpha$, the probability of (14.9.10) is $\ge 1-\alpha$. Thus (14.9.10) is a set of simultaneous confidence bounds with probability $\ge 1-\alpha$.
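Bounds of the kind just obtained, which enclose every characteristic root of $\Sigma$ between multiples of $c_{\min}(S)$ and $c_{\max}(S)$, are straightforward to evaluate once the two constants are known. The sketch below assumes NumPy and assumes that `c1_alpha` and `c2_alpha` (the constants $c_{1\alpha}(p,n)$ and $c_{2\alpha}(p,n)$) are supplied by the caller; the function name is hypothetical.

```python
import numpy as np

def dispersion_root_bounds(S, n, c1_alpha, c2_alpha):
    """Sketch of bounds of the form (14.9.10) on all characteristic roots of Sigma.

    S                  : p x p sample dispersion matrix (nS = YY').
    n                  : degrees of freedom of S.
    c1_alpha, c2_alpha : lower and upper constants, supplied by the caller.
    Returns (lower, upper) with lower <= every c(Sigma) <= upper.
    """
    roots = np.linalg.eigvalsh(S)          # ascending order for symmetric S
    lower = n * roots[0] / c2_alpha
    upper = n * roots[-1] / c1_alpha
    return lower, upper
```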


Also using (A.1.21) we observe that (14.9.8) $\Rightarrow$ the following:

$$[n\,c_{1\alpha}^{-1}(p, n)]^i\,\mathrm{tr}_i(S) \ge \mathrm{tr}_i(\Sigma) \ge [n\,c_{2\alpha}^{-1}(p, n)]^i\,\mathrm{tr}_i(S),\quad (i = 1, 2, \ldots, p), \tag{14.9.12}$$

which, by using (14.9.11), is thus another set of simultaneous confidence bounds on $c(\Sigma)$'s.

The derivation of the confidence bounds (14.9.10) can be simplified if we start from the canonical p.d.f.

$$\text{Const}\,\exp\left[-\tfrac{1}{2}\,\mathrm{tr}\,D_{1/\gamma}YY'\right], \tag{14.9.13}$$

and recall that (i) $S = YY'/n$, (ii) $c(YY') = c(AYY'A')$ for any orthogonal $A$ and (iii) $\mathrm{tr}\,D_{1/\gamma}YY' = \mathrm{tr}\,D_{1/\sqrt{\gamma}}YY'D_{1/\sqrt{\gamma}}$, so that $c(D_{1/\sqrt{\gamma}}YY'D_{1/\sqrt{\gamma}})$'s or $c(D_{1/\sqrt{\gamma}}\,nS\,D_{1/\sqrt{\gamma}})$'s are distributed as $c(nS)$'s when $\gamma$'s $= 1$, which distribution has been already used in the above derivation. However, the lengthier derivation is instructive in certain ways.


Going back to (14.9.8) and using Chapter A.2, we note that (14.9.8) $\Rightarrow$

$$n\,c_{1\alpha}^{-1}(p, n) \ge \frac{a'\Sigma a}{a'Sa} \ge n\,c_{2\alpha}^{-1}(p, n),$$
$$\text{or}\quad n\,c_{1\alpha}^{-1}(p, n)\,a'Sa \ge a'\Sigma a \ge n\,c_{2\alpha}^{-1}(p, n)\,a'Sa, \tag{14.9.14}$$

which is, therefore, a set of simultaneous confidence bounds on $a'\Sigma a$ for all arbitrary non-null $a'(1\times p)$, and with a confidence coefficient $1-\alpha$.

If we start from (14.9.14) and choose $a$ so as to maximize $a'\Sigma a$, then it is easy to check that (14.9.14) will imply that $n\,c_{1\alpha}^{-1}(p, n)\,c_{\max}(S) \ge c_{\max}(\Sigma)$; also if we choose $a$ so as to maximize $a'Sa$, then (14.9.14) will be seen to imply that $c_{\max}(\Sigma) \ge n\,c_{2\alpha}^{-1}(p, n)\,c_{\max}(S)$. Similarly for the $c_{\min}$'s. Thus (14.9.14) will imply

$$n\,c_{1\alpha}^{-1}(p, n)\,c_{\max}(S) \ge c_{\max}(\Sigma) \ge n\,c_{2\alpha}^{-1}(p, n)\,c_{\max}(S)$$
$$\text{and}\quad n\,c_{1\alpha}^{-1}(p, n)\,c_{\min}(S) \ge c_{\min}(\Sigma) \ge n\,c_{2\alpha}^{-1}(p, n)\,c_{\min}(S), \tag{14.9.15}$$

which, therefore, is a pair of confidence statements with a joint confidence coefficient $\ge 1-\alpha$. Incidentally, we notice that (14.9.15) implies (14.9.10) and thus provides another derivation of (14.9.10).


Furthermore, there is a lot more to (14.9.14) than just (14.9.15). Since (14.9.14) is supposed to be true for all non-null $a(p\times 1)$, we can specialize by putting one, two or more components of $a(p\times 1)$ equal to zero and then we can take arbitrary values for the other components. Now let us use the same argument as before, and denote by $S^{(i)}$, $\Sigma^{(i)}$ the truncated $(p-1)\times(p-1)$ sample and population dispersion matrices formed by cutting out the $i$-th variate, by $S^{(i,j)}$ and $\Sigma^{(i,j)}$ the truncated $(p-2)\times(p-2)$ sample and population dispersion matrices formed by cutting out the $i$-th and $j$-th variates ($i \ne j = 1, 2, \ldots, p$) and so on. Then it is easy to check that (14.9.14) really implies (14.9.15) together with the statements

$$n\,c_{1\alpha}^{-1}(p, n)\,c_{\max}(S^{(i)}) \ge c_{\max}(\Sigma^{(i)}) \ge n\,c_{2\alpha}^{-1}(p, n)\,c_{\max}(S^{(i)})$$
$$\text{and}\quad n\,c_{1\alpha}^{-1}(p, n)\,c_{\min}(S^{(i)}) \ge c_{\min}(\Sigma^{(i)}) \ge n\,c_{2\alpha}^{-1}(p, n)\,c_{\min}(S^{(i)}), \tag{14.9.16}$$

for $i = 1, 2, \ldots, p$;

$$n\,c_{1\alpha}^{-1}(p, n)\,c_{\max}(S^{(i,j)}) \ge c_{\max}(\Sigma^{(i,j)}) \ge n\,c_{2\alpha}^{-1}(p, n)\,c_{\max}(S^{(i,j)})$$
$$\text{and}\quad n\,c_{1\alpha}^{-1}(p, n)\,c_{\min}(S^{(i,j)}) \ge c_{\min}(\Sigma^{(i,j)}) \ge n\,c_{2\alpha}^{-1}(p, n)\,c_{\min}(S^{(i,j)}),$$

for $i \ne j = 1, 2, \ldots, p$;

and so on down to the stage of cutting out any $p-1$, i.e., retaining any one variate. All these statements, $2^p-1$ in number, have a joint confidence coefficient $\ge 1-\alpha$ and provide one type of complete attack on what the psychologists call the problem of latent structure.

14.10. Confidence bounds on the characteristic roots of $\Sigma_1\Sigma_2^{-1}$. Let us start from $Y_h(p\times n_h)$: $N^*(0, \Sigma_h)$ $(h = 1, 2)$, where we assume that $p \le n_1, n_2$ and $\Sigma_1$ and $\Sigma_2$ are both p.d. so that the characteristic roots of $\Sigma_1\Sigma_2^{-1}$ are all positive and those of $Y_1Y_1'(Y_2Y_2')^{-1}$, i.e., of $(n_1/n_2)S_1S_2^{-1}$ are, a.e., all positive. We recall the results of Chapters A.4 and A.7 and start, without any loss of generality, from the canonical probability density (in terms of transformed variates $Y_1$, $Y_2$)

$$\text{Const}\,\exp\left[-\tfrac{1}{2}\,\mathrm{tr}\,(D_{1/\gamma}Y_1Y_1'+Y_2Y_2')\right] = \text{Const}\,\exp\left[-\tfrac{1}{2}\,\mathrm{tr}\,(D_{1/\sqrt{\gamma}}Y_1Y_1'D_{1/\sqrt{\gamma}}+Y_2Y_2')\right]. \tag{14.10.1}$$

It is to be noticed that the $c[(n_1/n_2)S_1S_2^{-1}]$'s, i.e., the $c[(Y_1Y_1')(Y_2Y_2')^{-1}]$'s, are the same in the original and in the transformed variates and that the $\gamma$'s are $c(\Sigma_1\Sigma_2^{-1})$'s.

If we now put $Z_1 = D_{1/\sqrt{\gamma}}Y_1$ and $Z_2 = Y_2$, it is easy to check that the probability density of $Z_1$ and $Z_2$ is $(2\pi)^{-p(n_1+n_2)/2}\exp[-\frac{1}{2}\,\mathrm{tr}\,(Z_1Z_1'+Z_2Z_2')]$.

For all non-null non-stochastic $a(p\times 1)$ consider the set of statements

$$g_1^2 \le a'Z_1Z_1'a/a'Z_2Z_2'a \le g_2^2,$$
$$\text{or}\quad g_1^2 \le a'(D_{1/\sqrt{\gamma}}Y_1Y_1'D_{1/\sqrt{\gamma}})a/a'Y_2Y_2'a \le g_2^2,$$
$$\text{or}\quad \frac{n_2}{n_1}g_1^2 \le a'(D_{1/\sqrt{\gamma}}S_1D_{1/\sqrt{\gamma}})a/a'S_2a \le \frac{n_2}{n_1}g_2^2. \tag{14.10.2}$$

For given $Z_1$, $Z_2$, $g_1^2$ and $g_2^2$, this set of statements is precisely equivalent to the statement that

$$g_1^2 \le \inf_a\frac{a'Z_1Z_1'a}{a'Z_2Z_2'a} \le \sup_a\frac{a'Z_1Z_1'a}{a'Z_2Z_2'a} \le g_2^2\quad\text{or}\quad g_1^2 \le c_1 \le c_p \le g_2^2, \tag{14.10.3}$$

where $c_1$ and $c_p$ are the smallest and the largest characteristic roots of $(Z_1Z_1')(Z_2Z_2')^{-1}$, both, a.e., positive. The relevant distribution on the null hypothesis, i.e., when $\Sigma_1 = \Sigma_2$, is known from Chapter 8 and we now put

$$g_1^2 = c_{1\alpha}(p, n_1, n_2)\quad\text{and}\quad g_2^2 = c_{2\alpha}(p, n_1, n_2), \tag{14.10.4}$$

where $c_{1\alpha}(p, n_1, n_2)$ and $c_{2\alpha}(p, n_1, n_2)$ are constants taken over from (6.4.6).


. Changing back to Sı, S, and y’s and putting c,, and Cz, for с, (p, m, No) 
and ca,(p, 14, na) for shortness, we can now rewrite the second form іп (14.10.3) as 


™ ort > all «(8р 5811 jz) >a esl. s.s (14.10.5) 
А 2 


Now noting from (A.1.12) that D 812) у and of course Sj! and S, are 
symmetric p.d. and using (A.1.25) we have 


Cmax( S152") Cmax( SaD ASDD т) >... 
all 8,0 58.” 5) > Cmin(Sy82")emin( SD 7STD у). sss (14.10.6) 
Also using (A.3.9) and putting S, = 7/7" and then using (A.1.24) we should have 
Caax( SiD FST D7) = Cal ŽE Р 50" ATD 5) e. (14.10.7) 


= Osa (I^ Dj TaD Й) > all (I^ D 173), that is > all (D 5), that 
is, > all y;'s. 
Likewise we have 
Cas ( S.D 58; D yz) < all уг. s. (14.10.8) 


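Now combining (14.10.6)-(14.10.8), we observe that (14.10.5) $\Rightarrow$

$$\frac{n_1}{n_2}\,c_{1\alpha}^{-1}(p, n_1, n_2)\,c_{\max}(S_1S_2^{-1}) \ge \text{all } \gamma\text{'s} = \text{all } c(\Sigma_1\Sigma_2^{-1}) \ge \frac{n_1}{n_2}\,c_{2\alpha}^{-1}(p, n_1, n_2)\,c_{\min}(S_1S_2^{-1}), \tag{14.10.9}$$

which, therefore, by using (14.9.11), is easily seen to be a set of simultaneous confidence bounds with a joint confidence coefficient, say $1-\alpha' \ge 1-\alpha$.

Now using (A.1.22) we have

$$c_{\max}(S_1S_2^{-1}) \le c_{\max}(S_1)\,c_{\max}(S_2^{-1}), \text{ that is, } \le c_{\max}(S_1)/c_{\min}(S_2), \quad\text{and}\quad c_{\min}(S_1S_2^{-1}) \ge c_{\min}(S_1)/c_{\max}(S_2). \tag{14.10.10}$$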
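The inequalities in (14.10.10) can be checked numerically. The following is only a minimal sketch for the case $p = 2$, where the characteristic roots follow from the trace and determinant; the two matrices are arbitrary illustrative values and the helper names are not from the text.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def roots2(m):
    """(smallest, largest) characteristic roots of a 2x2 matrix with real roots."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return (tr - disc) / 2.0, (tr + disc) / 2.0

# Illustrative symmetric positive definite matrices (assumed values).
S1 = [[4.0, 1.0], [1.0, 3.0]]
S2 = [[2.0, 0.5], [0.5, 1.0]]

cmin_ratio, cmax_ratio = roots2(matmul2(S1, inv2(S2)))
cmin1, cmax1 = roots2(S1)
cmin2, cmax2 = roots2(S2)

upper = cmax1 / cmin2   # bound on c_max(S1 S2^{-1})
lower = cmin1 / cmax2   # bound on c_min(S1 S2^{-1})
print(lower, cmin_ratio, cmax_ratio, upper)
```

The printed values are ordered from smallest to largest, which is why the bounds in (14.10.11) are wider than those in (14.10.9).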
Thus (14.10.9) $\Rightarrow$

$$\frac{n_1}{n_2}\,c_{1\alpha}^{-1}(p, n_1, n_2)\,c_{\max}(S_1)/c_{\min}(S_2) \ge \text{all } c(\Sigma_1\Sigma_2^{-1}) \ge \frac{n_1}{n_2}\,c_{2\alpha}^{-1}(p, n_1, n_2)\,c_{\min}(S_1)/c_{\max}(S_2), \tag{14.10.11}$$

which, therefore, by using (14.9.11), is a set of simultaneous confidence bounds with a joint confidence coefficient $1-\alpha'' \ge 1-\alpha' \ge 1-\alpha$.


Going back to (14.10.9) and using Chapter A.2, we note that (14.10.9) $\Rightarrow$

$$\frac{n_1}{n_2}\,c_{1\alpha}^{-1}(p, n_1, n_2)\,c_{\max}(S_1S_2^{-1}) \ge \frac{a'\Sigma_1a}{a'\Sigma_2a} \ge \frac{n_1}{n_2}\,c_{2\alpha}^{-1}(p, n_1, n_2)\,c_{\min}(S_1S_2^{-1}), \tag{14.10.12}$$

which is, therefore, a set of simultaneous confidence bounds on $a'\Sigma_1a/a'\Sigma_2a$ for all arbitrary non-null $a'(1 \times p)$ and with a confidence coefficient $\ge 1-\alpha$. Notice the essential difference between (14.10.9) and (14.10.12).

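Let us now go back to (14.10.2), recall that that statement is supposed to be true for all non-null $a(p \times 1)$ and specialize by putting one, two or more components of $a$ equal to zero and then use the same kind of argument as from (14.10.2) to (14.10.9). Also use the same notation as in (14.9.16) for the truncated $(p-1)\times(p-1)$, $(p-2)\times(p-2)$, ..., sample and population dispersion matrices obtained by cutting out any $i$-th variate (with $i = 1, 2, \ldots, p$), any $i$-th and $j$-th variates (with $i \ne j = 1, 2, \ldots, p$) and so on. Then (14.10.2) will not only imply (14.10.9) but also the statements

$$\lambda_1\,c_{\max}\left(S_1^{(i)}S_2^{(i)-1}\right) \ge \text{all } c\left(\Sigma_1^{(i)}\Sigma_2^{(i)-1}\right) \ge \lambda_2\,c_{\min}\left(S_1^{(i)}S_2^{(i)-1}\right) \quad (\text{for } i = 1, 2, \ldots, p); \tag{14.10.13}$$

$$\lambda_1\,c_{\max}\left(S_1^{(ij)}S_2^{(ij)-1}\right) \ge \text{all } c\left(\Sigma_1^{(ij)}\Sigma_2^{(ij)-1}\right) \ge \lambda_2\,c_{\min}\left(S_1^{(ij)}S_2^{(ij)-1}\right) \quad (\text{for } i \ne j = 1, 2, \ldots, p);$$

and so on, with a joint confidence coefficient $\ge 1-\alpha$. The total number of such statements will be $2^p-1$. Here $\lambda_1 = \frac{n_1}{n_2}\,c_{1\alpha}^{-1}(p, n_1, n_2)$ and $\lambda_2 = \frac{n_1}{n_2}\,c_{2\alpha}^{-1}(p, n_1, n_2)$.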


14.11. Confidence bounds on regression-like parameters.

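(i) Some preliminary observations. We now start with a random sample of size $n$ ($> p+q$; $p \le q$) from a $(p+q)$-variate normal population, and next reduce for the means and set

$$(n-1)\begin{bmatrix} S_{11} & S_{12} \\ S_{12}' & S_{22} \end{bmatrix} = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}\begin{bmatrix} Y_1' & Y_2' \end{bmatrix},$$

(the row blocks being of orders $p$ and $q$), where $S_{11}$, $S_{22}$ and $S_{12}$ stand respectively for the sample dispersion sub-matrices of the $p$-set, the $q$-set and that between the $p$-set and the $q$-set and where $Y_1$ and $Y_2$ have the p.d.f.

$$\text{Const}\,\exp\left[-\frac{1}{2}\operatorname{tr}\begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{bmatrix}^{-1}\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}\begin{bmatrix} Y_1' & Y_2' \end{bmatrix}\right]. \tag{14.11.1}$$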
For reasons which will become clear later we shall take $\beta = \Sigma_{12}\Sigma_{22}^{-1}$ and $B = S_{12}S_{22}^{-1}$ respectively as the population and the sample regression matrix of the $p$-set on the $q$-set. Suppose we now consider two new variate sets $x_1(p \times 1) - \beta(p \times q)\,x_2(q \times 1)$ and $x_2(q \times 1)$. Then the population covariance matrix $(x_1 - \beta x_2,\ x_2) = 0$, so that $c[(S_{11} - S_{12}\beta' - \beta S_{12}' + \beta S_{22}\beta')^{-1}(S_{12} - \beta S_{22})S_{22}^{-1}(S_{12}' - S_{22}\beta')]$'s are distributed as $c[S_{11}^{-1}S_{12}S_{22}^{-1}S_{12}']$'s when $\beta = 0$, i.e. when $\Sigma_{12} = 0$.


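(ii) Confidence bounds on the regression matrix $\beta$. Consider the statement

$$\text{all } c\text{'s} \le g^2 \quad\text{or}\quad c_p \le g^2, \tag{14.11.2}$$

where $c_i$'s ($i = 1, 2, \ldots, p$; $0 \le c_1 \le \ldots \le c_p \le 1$) are the roots of the determinantal equation in $c$:

$$\left|\,c\,(S_{11} - S_{12}\beta' - \beta S_{12}' + \beta S_{22}\beta') - (S_{12} - \beta S_{22})S_{22}^{-1}(S_{12}' - S_{22}\beta')\,\right| = 0. \tag{14.11.3}$$

Now put $e = c/(1-c)$, so that we have from (14.11.3) the determinantal equation in $e$,

$$\left|\,e\,(S_{11} - S_{12}S_{22}^{-1}S_{12}') - (S_{12}S_{22}^{-1} - \beta)S_{22}(S_{22}^{-1}S_{12}' - \beta')\,\right| = 0. \tag{14.11.4}$$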
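The passage from (14.11.3) to (14.11.4) can be checked in the simplest case $p = q = 1$, where both determinantal equations have a single root that can be written down explicitly. The sketch below uses arbitrary illustrative values for $S_{11}$, $S_{12}$, $S_{22}$ and $\beta$; the function names are hypothetical.

```python
def c_root(s11, s12, s22, beta):
    """Root of (14.11.3) when p = q = 1 (all quantities are scalars)."""
    numerator = (s12 - beta * s22) ** 2 / s22
    denominator = s11 - 2.0 * beta * s12 + beta * beta * s22
    return numerator / denominator

def e_root(s11, s12, s22, beta):
    """Root of (14.11.4) when p = q = 1, with B = s12 / s22."""
    b = s12 / s22
    return (b - beta) ** 2 * s22 / (s11 - s12 * s12 / s22)

# Illustrative values (assumed, not from the text).
s11, s12, s22, beta = 5.0, 2.0, 4.0, 0.2
c = c_root(s11, s12, s22, beta)
e = e_root(s11, s12, s22, beta)
print(c, e, c / (1.0 - c))
```

With these values the root of (14.11.4) agrees with $c/(1-c)$ computed from the root of (14.11.3), as the substitution requires.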


Notice that the statement (14.11.2) can now be replaced by the statement that the largest characteristic root $e_p \le g^2/(1-g^2)$, i.e.,

$$\text{all } c\left[(S_{11} - S_{12}S_{22}^{-1}S_{12}')^{-1}(B-\beta)S_{22}(B'-\beta')\right] \le g^2/(1-g^2), \tag{14.11.5}$$

$$\text{where}\quad B(p \times q) = S_{12}S_{22}^{-1}, \tag{14.11.6}$$

which may be appropriately called the matrix of sample regression of the $p$-set on the $q$-set.


We note that (14.11.5) $\Leftrightarrow$ (14.11.2), and that $c_p$ has the distribution of the largest characteristic root of the matrix $S_{11}^{-1}S_{12}S_{22}^{-1}S_{12}'$ when $\Sigma_{12} = 0$. The joint distribution of these central roots and also of the largest root being known, all that we have to do to make (14.11.5), i.e., (14.11.2), a simultaneous confidence statement with a joint coefficient $1-\alpha$ is to choose $g^2 = c_\alpha(p, q, n-1)$ where the quantity on the right side is defined by

$$P[\text{central } c_p \ge c_\alpha(p, q, n-1)] = \alpha. \tag{14.11.7}$$

Substituting now $c_\alpha(p, q, n-1)$ (to be sometimes denoted more simply by $c_\alpha$) for $g^2$ in (14.11.5), we have a simultaneous confidence statement with a joint confidence coefficient $1-\alpha$.


Now applying (A.1.18) and (A.1.22) (in the same manner as in the previous sections), we have from (14.11.5), now with a joint confidence coefficient $\ge 1-\alpha$, the following simultaneous confidence statement

$$\text{all } c[(B-\beta)(B-\beta)'] \le \frac{c_\alpha}{1-c_\alpha}\,c_{\max}(S_{11} - S_{12}S_{22}^{-1}S_{12}')\,c_{\max}(S_{22}^{-1}). \tag{14.11.8}$$

Using (A.1.22) again, we check that (14.11.8) can be replaced by the following wider bounds (with a confidence coefficient $\ge 1-\alpha$):

$$\text{all } c[(B-\beta)(B-\beta)'] \le \frac{c_\alpha}{1-c_\alpha}\left[1 - c_{\min}(S_{11}^{-1}S_{12}S_{22}^{-1}S_{12}')\right]c_{\max}(S_{11})\,c_{\max}(S_{22}^{-1}). \tag{14.11.9}$$


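We next recall the following two well-known results (A.2.5) and (A.2.7) which we restate for convenience as

$$\text{all } c(M) \le g \ (\text{for a } p \times p \text{ real matrix } M \text{ with real roots}) \Rightarrow d'(1 \times p)\,M(p \times p)\,d(p \times 1) \le g \ (\text{for all arbitrary unit vectors } d) \tag{14.11.10}$$

$$\text{and}\quad x'(1 \times q)\,x(q \times 1) \le \lambda\ (\lambda > 0) \Leftrightarrow |x'(1 \times q)\,d(q \times 1)| \le \sqrt{\lambda} \ (\text{for all arbitrary unit vectors } d). \tag{14.11.11}$$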


Applying (14.11.10) and (14.11.11) to (14.11.8) we have (with a joint confidence coefficient $\ge 1-\alpha$) the following simultaneous confidence statement (for all arbitrary unit vectors $d_1(p \times 1)$ and $d_2(q \times 1)$),

$$|d_1'(B-\beta)d_2| \le [\text{right side of (14.11.8)}]^{1/2} \tag{14.11.12}$$

or ultimately

$$d_1'Bd_2 - \sqrt{E} \le d_1'\beta d_2 \le d_1'Bd_2 + \sqrt{E}, \tag{14.11.13}$$

$$\text{where}\quad E = \frac{c_\alpha}{1-c_\alpha}\,c_{\max}(S_{11} - S_{12}S_{22}^{-1}S_{12}')\,c_{\max}(S_{22}^{-1}). \tag{14.11.14}$$

A set of simultaneous confidence bounds on just the elements $\beta_{ij}$ of the $\beta$-matrix would be a subset of the bounds on the total set $d_1'\beta d_2$. It is worthwhile to check that if $p = q = 1$, (14.11.13) reduces, as it should, to (13.4.4). Also if $p = 1$, we should have another special case of (14.11.13) giving a set of simultaneous confidence bounds on all linear functions of the partial regressions of one variate on several others. Thus, in several ways, (14.11.13) seems to be an appropriate generalization of (13.4.4).
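For $p = q = 1$ every matrix in (14.11.13) and (14.11.14) is a scalar, so the bounds can be written out directly. The sketch below is an illustration only: the value passed as `c_alpha` is a made-up number standing in for the tabulated constant $c_\alpha(p, q, n-1)$, and the function name is hypothetical.

```python
import math

def scalar_regression_bounds(s11, s12, s22, c_alpha):
    """Bounds B -/+ sqrt(E) of (14.11.13) when p = q = 1.

    c_alpha is assumed to be supplied by the user (0 < c_alpha < 1);
    it is NOT computed here from the largest-root distribution.
    """
    b = s12 / s22                              # sample regression coefficient B
    residual = s11 - s12 * s12 / s22           # S11 - S12 S22^{-1} S12'
    e_cap = c_alpha / (1.0 - c_alpha) * residual * (1.0 / s22)  # E of (14.11.14)
    half_width = math.sqrt(e_cap)
    return b - half_width, b + half_width

# Illustrative inputs; c_alpha = 0.2 is a hypothetical constant.
lower, upper = scalar_regression_bounds(5.0, 2.0, 4.0, 0.2)
print(lower, upper)
```

The interval is symmetric about the sample regression coefficient, and its half-width shrinks as $c_\alpha$ or the residual variance decreases.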


As in the derivation of (14.6.6) from (14.6.3) it is easy to check that (14.11.13) will imply

$$c_{\max}^{1/2}(BB') - \sqrt{E} \le c_{\max}^{1/2}(\beta\beta') \le c_{\max}^{1/2}(BB') + \sqrt{E}, \tag{14.11.14.1}$$

where $E$ is defined by (14.11.14). (14.11.14.1) is thus a confidence statement with a confidence coefficient $\ge 1-\alpha$.


Furthermore, if we now go back to (14.11.5) and replace it by 


(14.11.14.2) 


for all non-null $a(p \times 1)$, then we observe that (14.11.14.2) implies (14.11.13). Now we can specialize $a(p \times 1)$ by putting one, two or more components equal to zero and then, in each case, take arbitrary values of the other components and reason as from (14.11.5) to (14.11.13). Thus, if, as in the two previous cases, we denote by $S_{11}^{(i)}$, $S_{12}^{(i)}$, $B^{(i)}$, $\beta^{(i)}$, $S_{11}^{(ij)}$, $S_{12}^{(ij)}$, $B^{(ij)}$, $\beta^{(ij)}$, etc., the truncated matrices obtained by cutting out the $i$-th variate of the $p$-set (with $i = 1, 2, \ldots, p$), the $i$-th and $j$-th variates of the $p$-set (with $i \ne j = 1, 2, \ldots, p$) and so on, it is easy to check that (14.11.5) will imply not only (14.11.13) but similar statements on the truncated matrices as well, altogether $2^p-1$ in number, all with a simultaneous confidence coefficient $\ge 1-\alpha$. The same applies to a set of statements on the truncated matrices, similar to the statement (14.11.14.1). We also observe that we can extend this still further by doing a similar truncation with regard to the variates of the $q$-set, but this will not be discussed in the present monograph.
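For an alternative set of confidence bounds we proceed as follows. Going back again to (14.11.14.2) we rewrite it as

$$\frac{a'(B-\beta)S_{22}(B-\beta)'a}{a'(S_{11} - S_{12}S_{22}^{-1}S_{12}')a} \le \frac{c_\alpha}{1-c_\alpha} \tag{14.11.15}$$

for all arbitrary non-null $a(p \times 1)$.

Now using (A.3.9) and putting $S_{22} = TT'$ and using (14.11.10), we have (14.11.15) reducing to

$$a'(B-\beta)TT'(B-\beta)'a \le \frac{c_\alpha}{1-c_\alpha}\,a'(S_{11} - S_{12}S_{22}^{-1}S_{12}')a \tag{14.11.16}$$

$$\text{or}\quad a'BTb - \left[\frac{c_\alpha}{1-c_\alpha}\right]^{1/2}\left[a'(S_{11} - S_{12}S_{22}^{-1}S_{12}')a\right]^{1/2} \le a'\beta Tb \le a'BTb + \left[\frac{c_\alpha}{1-c_\alpha}\right]^{1/2}\left[a'(S_{11} - S_{12}S_{22}^{-1}S_{12}')a\right]^{1/2}, \tag{14.11.17}$$

for all arbitrary unit vectors $b(q \times 1)$. Now put $T(q \times q)\,b(q \times 1) = c(q \times 1)$ say, so that (14.11.17) reduces to the following set of simultaneous bounds with a confidence coefficient $1-\alpha$:

$$a'Bc - \left[\frac{c_\alpha}{1-c_\alpha}\right]^{1/2}\left[a'(S_{11} - S_{12}S_{22}^{-1}S_{12}')a\right]^{1/2} \le a'\beta c \le a'Bc + \left[\frac{c_\alpha}{1-c_\alpha}\right]^{1/2}\left[a'(S_{11} - S_{12}S_{22}^{-1}S_{12}')a\right]^{1/2} \tag{14.11.18}$$

for all arbitrary non-null $a(p \times 1)$ and all $c(q \times 1)$ subject to

$$1 = b'b = c'T'^{-1}T^{-1}c = c'(TT')^{-1}c = c'S_{22}^{-1}c. \tag{14.11.19}$$

These confidence bounds are no doubt closer than those of (14.11.13) but these seem to be useful from a physical standpoint only when we have, in the customary sense, a regression problem of a $p$-set on a $q$-set such that the $p$-set is stochastic while the $q$-set is fixed, so that $S_{22}$ (and hence $S_{22}^{-1}$) are neither unknown parameters nor stochastic variates but just a set of given constants.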


CHAPTER FIFTEEN 


Some Non-Parametric Generalizations of Analysis of 
Variance and Multivariate Analysis* 


15.1. Preliminaries and notation. In this chapter we shall be concerned with the statistical analysis of data in the form of observed frequencies in discrete (and finite) categories; a typical category being the $(ij)$-th cell of a lattice, with $i = 1, 2, \ldots, r_j$ and $j = 1, 2, \ldots, s$ and $\sum_{j=1}^{s} r_j = rs$ (say). Let $n_{ij}$ be the frequency in the $(ij)$-th cell, and $p_{ij}$ be the probability of getting an observation in that cell and let us assume that the observations are independent (in probability). Also let $\sum_j n_{ij} = n_{i0}$, $\sum_i n_{ij} = n_{0j}$, $\sum_{i,j} n_{ij} = n_{00} = n$ (say), $\sum_j p_{ij} = p_{i0}$, $\sum_i p_{ij} = p_{0j}$, $\sum_{i,j} p_{ij} = p_{00}$. Now let us assume that the sampling scheme is such that $n_{0j}$ is fixed from sample to sample and $p_{0j} = 1$, with $j = 1, 2, \ldots, s$. Then the likelihood function (which, in this case, is the same as the probability) is given by

$$\phi = \prod_{j=1}^{s}\left[\frac{n_{0j}!}{\prod_i n_{ij}!}\prod_i p_{ij}^{n_{ij}}\right]. \tag{15.1.1}$$

In most realistic problems, however, $r_j = r$ (i.e. independent of $j$) which is what will be assumed in the following discussion, although the possibility of a general form (15.1.1) will also be kept in mind. Notice that (15.1.1), or its special case when $r_j = r$, is based on the product of $s$ separate multinomial distributions. Now suppose that
$i$ is a multiple (here $k$-ple) subscript $i_1 i_2 \ldots i_k$ and $j$ also is a multiple (here $l$-ple) subscript $j_1 j_2 \ldots j_l$, with $i_1 = 1, 2, \ldots, r_{1j}$; $i_2 = 1, 2, \ldots, r_{2j}$; ...; $i_k = 1, 2, \ldots, r_{kj}$; and $j_1 \in (s_1)_{j_2 \ldots j_l}$, $j_2 \in (s_2)_{j_3 \ldots j_l}$, ..., $j_l = 1, 2, \ldots, s_l$, where $(s_1)_{j_2 \ldots j_l}$ is a subset of $s_1$ depending on $j_2, \ldots, j_l$ and so on up to $(s_{l-1})_{j_l}$. This will be said to be a $k$-variate body of data arranged in $l$ ways of classification and each of the running subscripts $i_1, i_2, \ldots, i_k$ will be said to be a ‘variate’ and each of the running subscripts $j_1, j_2, \ldots, j_l$ will be said to be a ‘way of classification’. As observed before, it may be noted that in most realistic problems $j$ will drop out of the range of $i_1, i_2, \ldots, i_k$, so that $i_1 = 1, 2, \ldots, r_1$; $i_2 = 1, 2, \ldots, r_2$; and so on. Also, in one class of problems the range of the subscripts $j_1, j_2, \ldots, j_l$ will be as indicated before while in another class of problems the range will be less general, being given by $j_1 = 1, 2, \ldots, s_1$; $j_2 = 1, 2, \ldots, s_2$; ...; $j_l = 1, 2, \ldots, s_l$.

In this chapter certain types of composite hypothesis on the $p$’s will be considered, the more general types of composite hypothesis and more general decision procedures involving the $p$’s being reserved for a later monograph. As will be observed later, a composite hypothesis, to be physically meaningful, will have a particular slant with regard to the ‘variate’ ‘i’ or its components $i_1, i_2, \ldots, i_k$ and another slant with respect to the ‘way of classification’ ‘j’ or its components $j_1, j_2, \ldots, j_l$.

* See references [1, 2, 8, 9, 10, 13, 25, 30, 33, 47, 48, 55] in this connection.

In many (though not all) problems to be discussed here either (i) $k = 1$ and $l = 1$ or 2, with $i_1 = i = 1, 2, \ldots, r$ and $j_1 = j = 1, 2, \ldots, s$ or $j_1 = 1, 2, \ldots, s_1$ and $j_2 = 1, 2, \ldots, s_2$, or (ii) $k = 2$ or 3 and $l = 1$, with $i_1 = 1, 2, \ldots, r_1$; $i_2 = 1, 2, \ldots, r_2$ and (when $k = 3$) $i_3 = 1, 2, \ldots, r_3$, and with $j_1 = j = 1$ (which subscript will, therefore, drop out). Case (i) will be called a univariate problem under one or two ways of classification and case (ii) will be called a bivariate or trivariate problem with one population. We shall now discuss some problems under a two-way or three-way frequency table.


15.2. Problems in a two-way table. To fix our ideas, consider first a two-way, say $r \times s$, table with observed frequencies $n_{ij}$ in the $(ij)$-th cell (with $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, s$). Also let $\sum_j n_{ij} = n_{i0}$, $\sum_i n_{ij} = n_{0j}$ and $\sum_{i,j} n_{ij} = n_{00} = n$ (say).


15.2.1. Both ‘i’ and ‘j’ are ‘variates’.

Assume that we have a sample of $n$ independent observations such that $p_{ij}$ is the probability of an observation in the $(ij)$-th cell, and $n$ is fixed from sample to sample. Also let $\sum_j p_{ij} = p_{i0}$, $\sum_i p_{ij} = p_{0j}$, $\sum_{i,j} p_{ij} = p_{00} = 1$. Then the likelihood function is given by

$$\phi = \frac{n!}{\prod_{i,j} n_{ij}!}\prod_{i,j} p_{ij}^{n_{ij}}. \tag{15.2.1.1}$$

The composite hypothesis [47, 48] we shall be interested in testing is that ‘i’ and ‘j’ are independent, that is, that $H_0 : p_{ij} = p_{i0}p_{0j}$ against $H \ne H_0$, where $p_{i0}$’s and $p_{0j}$’s are arbitrary positive nuisance parameters subject to $\sum_i p_{i0} = \sum_j p_{0j} = 1$. This is the analogue of the hypothesis of no correlation in a bivariate normal population. Under $H_0$ we shall have the likelihood function $\phi_0$ given by

$$\phi_0 = \frac{n!}{\prod_{i,j} n_{ij}!}\prod_{i,j}(p_{i0}p_{0j})^{n_{ij}} = \frac{n!}{\prod_{i,j} n_{ij}!}\prod_i p_{i0}^{n_{i0}}\prod_j p_{0j}^{n_{0j}}. \tag{15.2.1.2}$$
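The two forms of $\phi_0$ in (15.2.1.2) can be compared numerically. The sketch below evaluates both on the log scale for an illustrative $2 \times 3$ table; the table, the marginal probabilities and the function names are assumptions made for the example.

```python
import math

def log_multinomial_coeff(table):
    """log of n! / prod(n_ij!) for a two-way table of counts."""
    n = sum(sum(row) for row in table)
    return math.lgamma(n + 1) - sum(math.lgamma(x + 1) for row in table for x in row)

def loglik_cells(table, p_row, p_col):
    """log phi_0, first form: product over cells of (p_i0 * p_0j) ** n_ij."""
    ll = log_multinomial_coeff(table)
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            ll += n_ij * math.log(p_row[i] * p_col[j])
    return ll

def loglik_margins(table, p_row, p_col):
    """log phi_0, second form: uses only the marginal totals n_i0 and n_0j."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    ll = log_multinomial_coeff(table)
    ll += sum(m * math.log(p) for m, p in zip(row_tot, p_row))
    ll += sum(m * math.log(p) for m, p in zip(col_tot, p_col))
    return ll

# Illustrative data and nuisance parameters (assumed values).
table = [[10, 5, 5], [4, 8, 8]]
p_row = [0.5, 0.5]
p_col = [0.35, 0.325, 0.325]
first_form = loglik_cells(table, p_row, p_col)
second_form = loglik_margins(table, p_row, p_col)
print(first_form, second_form)
```

The agreement of the two values reflects the fact that, under $H_0$, the data enter the likelihood only through the marginal totals and the multinomial coefficient.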


15.2.2. ‘i’ is a ‘way of classification’ and ‘j’ is a ‘variate’.

Assume that we have $r$ independent sets of sizes $n_{10}, n_{20}, \ldots, n_{r0}$ of independent observations such that $n_{i0}$ ($i = 1, 2, \ldots, r$) is fixed from sample to sample and $p_{ij}$ is the probability of an observation in the $(ij)$-th cell. Also we notice that $\sum_j p_{ij} = p_{i0} = 1$. Then the likelihood function is given by

$$\phi = \prod_i\left[\frac{n_{i0}!}{\prod_j n_{ij}!}\prod_j p_{ij}^{n_{ij}}\right]. \tag{15.2.2.1}$$
The composite hypothesis we shall be interested in testing is that $p_{ij}$, for any $j$, is independent of ‘i’ or in other words, that $H_0 : p_{ij} = q_{0j}$ (say) against $H \ne H_0$, where $q_{0j}$’s are arbitrary positive nuisance parameters subject to $\sum_j q_{0j} = \sum_j p_{ij} = p_{i0} = 1$. This is the analogue of the hypothesis of the equality of means for $r$ homoscedastic univariate normal populations. Under $H_0$ we shall have

$$\phi_0 = \frac{\prod_i n_{i0}!}{\prod_{i,j} n_{ij}!}\prod_j q_{0j}^{n_{0j}}. \tag{15.2.2.2}$$


This $\phi_0$ could also have been obtained by starting from the $\phi$ of (15.2.1.1), then putting $H_0 : p_{ij} = p_{i0}p_{0j}$ and then finding under $H_0$ the conditional probability subject to $n_{i0}$’s being fixed. But it seems that physically this is far less realistic than the model here used, although historically this is more or less what has been done so far.

The case of ‘i’ being a ‘variate’ and ‘j’ a ‘way of classification’ is exactly similar and need not be separately considered.
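The remark that (15.2.2.2) is also the conditional probability obtained from (15.2.1.1) under independence, given the $n_{i0}$’s, can be verified exactly on a small table. The following sketch uses exact rational arithmetic; the table and the probabilities are illustrative assumptions, and the function names are not from the text.

```python
from fractions import Fraction
from math import factorial

def multinomial_prob(counts, probs):
    """Exact multinomial probability n!/prod(c!) * prod(p ** c)."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)          # exact at every step
    value = Fraction(coeff)
    for c, p in zip(counts, probs):
        value *= p ** c
    return value

def joint_under_independence(table, p_row, p_col):
    """(15.2.1.1) with p_ij = p_i0 * p_0j."""
    cells = [x for row in table for x in row]
    probs = [pr * pc for pr in p_row for pc in p_col]
    return multinomial_prob(cells, probs)

def product_multinomial(table, q_col):
    """(15.2.2.2): product over rows of multinomials with common q_0j."""
    value = Fraction(1)
    for row in table:
        value *= multinomial_prob(row, q_col)
    return value

# Illustrative 2 x 2 table and nuisance parameters (assumed values).
table = [[3, 1], [2, 4]]
p_row = [Fraction(2, 5), Fraction(3, 5)]
p_col = [Fraction(1, 3), Fraction(2, 3)]
row_totals = [sum(row) for row in table]

conditional = (joint_under_independence(table, p_row, p_col)
               / multinomial_prob(row_totals, p_row))
direct = product_multinomial(table, p_col)
print(conditional == direct)
```

The row probabilities $p_{i0}$ cancel in the conditional probability, which is why only the $q_{0j}$’s remain as nuisance parameters in (15.2.2.2).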


15.2.3. Both ‘i’ and ‘j’ are ‘ways of classification’. Here we have a sampling scheme in which $n_{i0}$’s and $n_{0j}$’s ($i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, s$) are supposed to be fixed from sample to sample. In this situation, on the hypothesis of independence between ‘i’ and ‘j’, we can write down the likelihood function $\phi_0$ without assuming that the observations are independent. For this we start from an urn problem model in which there is an urn containing $n_{10}, n_{20}, \ldots, n_{r0}$ balls of $r$ different colors from which we draw successively (without replacement) $n_{01}, n_{02}, \ldots, n_{0s}$ balls (with $\sum_i n_{i0} = \sum_j n_{0j} = n$). The joint probability that the $j$-th bunch $n_{0j}$ will contain $n_{1j}, n_{2j}, \ldots, n_{rj}$ balls of different colors (with $j = 1, 2, \ldots, s$) will be given by

$$\phi_0 = \frac{\prod_i n_{i0}!\,\prod_j n_{0j}!}{n!\,\prod_{i,j} n_{ij}!}. \tag{15.2.3.1}$$
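Formula (15.2.3.1) is easy to evaluate exactly. The sketch below computes it for $2 \times 2$ tables with all four marginal totals equal to 4 (margins of this kind arise in a tea-tasting experiment with eight cups; the specific numbers are an illustrative assumption) and checks that the probabilities of all tables with those margins add up to one.

```python
from fractions import Fraction
from math import factorial

def phi0(table):
    """(15.2.3.1): probability of a table when both sets of margins are fixed."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    numerator = 1
    for m in row_tot + col_tot:
        numerator *= factorial(m)
    denominator = factorial(n)
    for row in table:
        for x in row:
            denominator *= factorial(x)
    return Fraction(numerator, denominator)

def all_2x2_tables(row_tot, col_tot):
    """Every non-negative 2 x 2 table with the given marginal totals."""
    tables = []
    for a in range(min(row_tot[0], col_tot[0]) + 1):
        b = row_tot[0] - a
        c = col_tot[0] - a
        d = row_tot[1] - c
        if min(b, c, d) >= 0:
            tables.append([[a, b], [c, d]])
    return tables

perfect = phi0([[4, 0], [0, 4]])
total = sum(phi0(t) for t in all_2x2_tables([4, 4], [4, 4]))
print(perfect, total)   # 1/70 1
```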


The great advantage of this scheme is that the different observations need not be assumed to be independent and the great disadvantage is that we would not know how to write down $\phi$ (under a general $H$ as distinct from the null hypothesis $H_0$ of independence between ‘i’ and ‘j’). This means that here it is not only that we do not have any idea of the power of a test for $H_0$ against alternatives but also that it would not be possible to obtain a one-tailed $\chi^2$-test for $H_0$ by using the same kind of heuristic arguments that we shall use for the first two situations. We can use a one-tailed $\chi^2$-test here just by analogy with what we do in the first two cases.

This $\phi_0$ could also have been obtained by starting from the $\phi$ of (15.2.1.1), then putting $H_0 : p_{ij} = p_{i0}p_{0j}$ and then finding under $H_0$ the conditional probability subject to $n_{i0}$’s and $n_{0j}$’s being fixed. But, for one thing, this, except for some very special situations, would be less realistic than the model here used and, for another thing, this would deprive (15.2.3.1) of the one great advantage it possesses in that the successive observations do not have to be independent. Notice that (15.2.1.1) is based on the assumption of the observations being all independent.
It will be seen that the approach here is not one of conditional probability at all. It will also be seen that there are three different sampling schemes each leading in a natural way to a particular probability model and a particular type of hypothesis to be tested. From a physical standpoint it would not be proper to break this tie and use a particular probability model and test a particular type of hypothesis when the sampling scheme is something different. It will be noticed that in most situations of life the natural sampling schemes are those of (i) or (ii) but there are situations, e.g. Fisher's tea tasting experiments or those connected with the extra-sensory perception experiments or with the claims of astrologers as to prediction, etc., where (iii) might be a natural sampling scheme.

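15.3. Problems in a three-way table. As a natural extension of a two-way table consider a three-way $r \times s \times t$ table with observed frequencies $n_{ijk}$ in the $(ijk)$-th cell (with $i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, s$; $k = 1, 2, \ldots, t$). Also let $\sum_k n_{ijk} = n_{ij0}$, $\sum_j n_{ijk} = n_{i0k}$, $\sum_i n_{ijk} = n_{0jk}$, $\sum_{i,j} n_{ijk} = n_{00k}$, $\sum_{i,k} n_{ijk} = n_{0j0}$, $\sum_{j,k} n_{ijk} = n_{i00}$, $\sum_{i,j,k} n_{ijk} = n_{000} = n$ (say).

15.3.1. ‘i’, ‘j’ and ‘k’ all ‘variates’. Assume that we have a sample of $n$ independent observations such that $p_{ijk}$ is the probability of an observation in the $(ijk)$-th cell and $n$ is fixed from sample to sample. Also let $\sum_i p_{ijk} = p_{0jk}$, $\sum_j p_{ijk} = p_{i0k}$, $\sum_k p_{ijk} = p_{ij0}$, $\sum_{i,j} p_{ijk} = p_{00k}$, $\sum_{i,k} p_{ijk} = p_{0j0}$, $\sum_{j,k} p_{ijk} = p_{i00}$, $\sum_{i,j,k} p_{ijk} = p_{000} = 1$.

The likelihood function will be given by

$$\phi = \frac{n!}{\prod_{i,j,k} n_{ijk}!}\prod_{i,j,k} p_{ijk}^{n_{ijk}}. \tag{15.3.1.1}$$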
In this case, as indicated in [47, 48], we shall be interested in testing a class of composite hypotheses, a typical one being,

15.3.1a. Hypothesis of conditional independence between ‘i’ and ‘j’ | ‘k’. This will be

$$H_0 : \frac{p_{ijk}}{p_{00k}} = \frac{p_{i0k}}{p_{00k}}\cdot\frac{p_{0jk}}{p_{00k}} \quad\text{or}\quad p_{ijk} = \frac{p_{i0k}\,p_{0jk}}{p_{00k}} \tag{15.3.1.2}$$

against $H \ne H_0$ (for $i = 1, \ldots, r$; $j = 1, \ldots, s$; $k = 1, \ldots, t$).

This is the analogue of the hypothesis of no partial correlation (between $x$ and $y$) | $z$ in a three-variate normal population. As shown in [47], if we superimpose on this the composite hypothesis of independence between ‘i’ and ‘k’, and between ‘j’ and ‘k’, i.e.,

$$p_{i0k} = p_{i00}\,p_{00k} \quad\text{and}\quad p_{0jk} = p_{0j0}\,p_{00k}, \tag{15.3.1.3}$$

(which is the analogue of the hypothesis of no total correlation between ($x$ and $z$) and ($y$ and $z$) in a three-variate normal population) we should have

$$p_{ijk} = p_{i00}\,p_{0j0}\,p_{00k}, \tag{15.3.1.4}$$

which is the condition of complete independence of ‘i’, ‘j’ and ‘k’.
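The step from (15.3.1.2) and (15.3.1.3) to (15.3.1.4) can be traced on a small example. In the sketch below the one-way marginal probabilities are arbitrary illustrative values; the two-way margins are built from (15.3.1.3), the cell probabilities from (15.3.1.2), and the result is compared with the product form (15.3.1.4) in exact arithmetic.

```python
from fractions import Fraction
from itertools import product

# Illustrative one-way marginal probabilities (assumed values).
p_i = [Fraction(1, 4), Fraction(3, 4)]
p_j = [Fraction(1, 3), Fraction(2, 3)]
p_k = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

# (15.3.1.3): 'i' independent of 'k', and 'j' independent of 'k'.
p_i0k = {(i, k): p_i[i] * p_k[k] for i, k in product(range(2), range(3))}
p_0jk = {(j, k): p_j[j] * p_k[k] for j, k in product(range(2), range(3))}

# (15.3.1.2): conditional independence of 'i' and 'j' given 'k'.
p_ijk = {
    (i, j, k): p_i0k[i, k] * p_0jk[j, k] / p_k[k]
    for i, j, k in product(range(2), range(2), range(3))
}

# (15.3.1.4): complete independence.
complete = all(
    p_ijk[i, j, k] == p_i[i] * p_j[j] * p_k[k] for (i, j, k) in p_ijk
)
total = sum(p_ijk.values())
print(complete, total)
```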
Following [47] we shall also be interested in another class of composite hypotheses, the typical one being,

15.3.1b. Hypothesis of independence between ‘(i, j)’ and ‘k’. This will be

$$H_0 : p_{ijk} = p_{ij0}\,p_{00k} \ \text{ against } H \ne H_0 \ (\text{for } i = 1, \ldots, r;\ j = 1, \ldots, s;\ k = 1, \ldots, t). \tag{15.3.1.5}$$

This is the analogue of the hypothesis of no multiple correlation between ($x$, $y$) and $z$ in a three-variate normal population.

As shown in [47], (15.3.1.5) implies the composite hypotheses

$$p_{i0k} = p_{i00}\,p_{00k} \quad\text{and}\quad p_{0jk} = p_{0j0}\,p_{00k}. \tag{15.3.1.6}$$

But (15.3.1.6) will not imply (15.3.1.5). The extra condition that was needed on top of (15.3.1.6) to lead to (15.3.1.5) was shown [47, 48] to be the composite hypothesis

$$H_0 : p_{ijk} = \frac{q_{ij0}\,q_{i0k}\,q_{0jk}}{q_{i00}\,q_{0j0}\,q_{00k}}, \tag{15.3.1.7}$$

where $q_{ij0}$, $q_{i0k}$, $q_{0jk}$ were defined to be arbitrary (positive) functions of $(i, j)$, $(i, k)$ and $(j, k)$ and $q_{i00}$, $q_{0j0}$, $q_{00k}$ arbitrary (positive) functions of $i$, $j$ and $k$, with no summation convention connecting them as in the case of the $p_{ijk}$’s. By analogy with analysis of variance this will be called the hypothesis of ‘no interaction’.
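That (15.3.1.6) does not imply (15.3.1.5) can be seen from a small counterexample. The $2 \times 2 \times 2$ distribution below is an illustrative construction (not taken from the text): every two-way margin is uniform, so (15.3.1.6) holds, yet the cell probabilities differ from $p_{ij0}\,p_{00k}$.

```python
from fractions import Fraction
from itertools import product

eps = Fraction(1, 16)
cells = list(product(range(2), repeat=3))
# Strictly positive cell probabilities 1/8 +/- eps with an alternating sign.
p = {(i, j, k): Fraction(1, 8) + eps * (-1) ** (i + j + k) for i, j, k in cells}

def margin(table, keep):
    """Sum the table over every axis not listed in `keep`."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[axis] for axis in keep)
        out[key] = out.get(key, Fraction(0)) + value
    return out

p_i, p_j, p_k = margin(p, (0,)), margin(p, (1,)), margin(p, (2,))
p_ij, p_ik, p_jk = margin(p, (0, 1)), margin(p, (0, 2)), margin(p, (1, 2))

# (15.3.1.6): both pairwise independences with 'k' hold.
pairwise = all(
    p_ik[i, k] == p_i[(i,)] * p_k[(k,)] and p_jk[j, k] == p_j[(j,)] * p_k[(k,)]
    for i, j, k in cells
)
# (15.3.1.5): independence of '(i, j)' and 'k' fails.
joint = all(p[i, j, k] == p_ij[i, j] * p_k[(k,)] for i, j, k in cells)
print(pairwise, joint)   # True False
```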

15.3.2. ‘i’ and ‘j’ are ‘variates’ and ‘k’ a ‘way of classification’.

Assume that we have $t$ independent sets of sizes $n_{001}, \ldots, n_{00t}$ of independent observations such that $n_{00k}$ ($k = 1, \ldots, t$) is fixed from sample to sample and $p_{ijk}$ is the probability of an observation in the $(ijk)$-th cell. Notice that $\sum_{i,j} p_{ijk} = p_{00k} = 1$.

The likelihood function will be given by

$$\phi = \prod_k\left[\frac{n_{00k}!}{\prod_{i,j} n_{ijk}!}\prod_{i,j} p_{ijk}^{n_{ijk}}\right]. \tag{15.3.2.1}$$

Here we shall be interested in testing,

15.3.2a. Hypothesis of independence between ‘i’ and ‘j’ for each ‘k’, i.e.,

$$H_0 : p_{ijk} = p_{i0k}\,p_{0jk} \ \text{ against } H \ne H_0 \ (\text{for } i = 1, \ldots, r;\ j = 1, \ldots, s;\ k = 1, \ldots, t). \tag{15.3.2.2}$$

If we superimpose on this the composite hypothesis that the marginal ‘i’ (obtained by summing over ‘j’) is independent of ‘k’ and similarly for ‘j’, i.e., that $p_{i0k}$ is a pure function of ‘i’ and $p_{0jk}$ is a pure function of ‘j’, i.e.,

$$p_{i0k} = q_{i00} \text{ (say)} \quad\text{and}\quad p_{0jk} = q_{0j0} \text{ (say)}, \tag{15.3.2.3}$$

we should have

$$p_{ijk} = q_{i00}\,q_{0j0}. \tag{15.3.2.4}$$

118 SOME NONPARAMETRIC GENERALIZATIONS 
Notice from (15.3.2.3) that $\sum_i q_{i00} = \sum_i p_{i0k} = p_{00k} = 1$ and also that $\sum_j q_{0j0} = \sum_j p_{0jk} = p_{00k} = 1$.

We shall also be interested in the composite hypothesis (15.3.2b) that $p_{ijk}$ is independent of ‘k’, i.e., $p_{ijk}$ is a pure function of ‘(ij)’, i.e.,

$$H_0 : p_{ijk} = q_{ij0} \text{ (say) against } H \ne H_0 \ (\text{for all } i, j \text{ and } k). \tag{15.3.2.5}$$

This is the analogue of the hypothesis of the equality of $t$ mean vectors (each consisting of 2 components) for $t$ bivariate normal populations, each having the same variance-covariance matrix.

If we sum over ‘j’ and ‘i’ separately, this would imply

$$p_{i0k} = \sum_j q_{ij0} = q_{i00} \text{ (say)} \quad\text{and}\quad p_{0jk} = \sum_i q_{ij0} = q_{0j0} \text{ (say)}. \tag{15.3.2.6}$$

As in the case where ‘i’, ‘j’ and ‘k’ are all ‘variates’, so also here, (15.3.2.5) $\Rightarrow$ (15.3.2.6) but (15.3.2.6) does not imply (15.3.2.5). Exactly in the same way as in [47, 48] it can be shown that the extra condition which when superimposed on (15.3.2.6) will $\Rightarrow$ (15.3.2.5) is

$$p_{ijk} = \frac{q_{ij0}\,q_{i0k}\,q_{0jk}}{q_{i00}\,q_{0j0}\,q_{00k}}. \tag{15.3.2.7}$$
15.3.3. ‘i’ is a ‘variate’ and ‘j’ and ‘k’ are ‘ways of classification’.

Assume that we have $s \times t$ independent sets of sizes $n_{0jk}$ of independent observations such that $n_{0jk}$ ($j = 1, \ldots, s$; $k = 1, \ldots, t$) is fixed from sample to sample, and $p_{ijk}$ is the probability of an observation in the $(ijk)$-th cell. Notice that $\sum_i p_{ijk} = p_{0jk} = 1$.

The likelihood function will be given by

$$\phi = \prod_{j,k}\left[\frac{n_{0jk}!}{\prod_i n_{ijk}!}\prod_i p_{ijk}^{n_{ijk}}\right]. \tag{15.3.3.1}$$

Here we shall be interested in the composite hypothesis (15.3.3a) that for any ‘k’, $p_{ijk}$ is independent of ‘j’, i.e., that $p_{ijk}$ is a pure function of ‘i’ and ‘k’, i.e.,

$$H_0 : p_{ijk} = q_{i0k} \text{ (say) against } H \ne H_0 \ (\text{for all } i, j \text{ and } k). \tag{15.3.3.2}$$

Notice that $\sum_i q_{i0k} = \sum_i p_{ijk} = p_{0jk} = 1$.

We shall be also interested in the other composite hypothesis (15.3.3b) that for any ‘j’, $p_{ijk}$ is independent of ‘k’, i.e., that $p_{ijk}$ is a pure function of ‘i’ and ‘j’, i.e.,

$$H_0 : p_{ijk} = q_{ij0} \text{ (say) against } H \ne H_0 \ (\text{for all } i, j \text{ and } k). \tag{15.3.3.3}$$

Notice that $\sum_i q_{ij0} = \sum_i p_{ijk} = p_{0jk} = 1$. We now observe that (15.3.3.2) together with (15.3.3.3) implies that $p_{ijk}$ is a pure function of ‘i’, i.e., that

$$p_{ijk} = q_{i00} \text{ (say), for all } i, j \text{ and } k. \tag{15.3.3.4}$$
If, in a one-way classification in the usual analysis of variance, ‘i’ corresponds to the ‘variate’, ‘j’ to the so-called ‘concomitant variate’ and ‘k’ to the ‘way of classification’, then it will be seen on a little reflection that (15.3.3.2) will be the analogue of the hypothesis of no regression and (15.3.3.3) will be the analogue of the hypothesis of no covariance. On the other hand, suppose we take ‘j’ and ‘k’ as just two ‘ways of classification’, for example, we take ‘j’ as, say, blocks and ‘k’ as, say, treatments in a randomized block experiment (with more than one and in general unequal number of replications in each cell). Then (15.3.3.2) will be the analogue of ‘no block effect’ for each treatment separately and (15.3.3.3) will be the analogue of ‘no treatment effect’ for each block separately. In other words, in the usual parlance of analysis of variance (15.3.3.2) lumps together one ‘no main effect’ and ‘no interaction’, while (15.3.3.3) lumps together another ‘no main effect’ and ‘no interaction’.


15.3.4. ‘i’ is a ‘variate’ and ‘j’ and ‘k’ are ‘ways of classification’ in the sense of a ‘balanced incomplete’ or ‘partially balanced incomplete’ or a more general type of ‘incomplete’ block experiment.

Assume as before that there are $r$ ‘i’’s, $s$ ‘j’’s and $t$ ‘k’’s. Assume further that ‘j’ is a block and ‘k’ a treatment and that, for any ‘j’, there is a set of treatments $(t)_j$ to go with it, of number $t_j$. In other words, for a given $j$, $k$ takes on the set of values $(t)_j$ where $(t)_j$ is a set of indices of number $t_j$ out of $1, 2, \ldots, t$. Now assume that we have $\sum_{j=1}^{s} t_j$ independent sets of sizes $n_{0jk}$ of independent observations such that $n_{0jk}$ ($k \in (t)_j$; $j = 1, 2, \ldots, s$) is fixed from sample to sample and $p_{ijk}$ is the probability of an observation in the $(ijk)$-th cell. As before $\sum_i p_{ijk} = p_{0jk} = 1$. The likelihood function will be given by

$$\phi = \prod_{j=1}^{s}\prod_{k \in (t)_j}\left[\frac{n_{0jk}!}{\prod_i n_{ijk}!}\prod_i p_{ijk}^{n_{ijk}}\right]. \tag{15.3.4.1}$$

We can take over the hypothesis (15.3.3.2) of ‘no block effect for each treatment separately’ and (15.3.3.3) of ‘no treatment effect for each block separately’. For a ‘balanced incomplete design’ all $t_j$’s will be equal and there will be a highly symmetrical pattern while for a ‘partially balanced design’ all $t_j$’s will be equal and there will be a less symmetrical pattern.


15.3.5. ‘i’ is a ‘way of classification’ and ‘(j, k)’ also are ‘ways of classification’ in the sense that $n_{i00}$’s and $n_{0jk}$’s are fixed from sample to sample. Following case (iii) of section 15.2, we can write down $\phi_0$ in this case (exactly the same way as we wrote down the $\phi_0$ in that case) on the hypothesis of independence between ‘i’ and ‘(j, k)’. This will be

$$\phi_0 = \frac{\prod_i n_{i00}!\,\prod_{j,k} n_{0jk}!}{n!\,\prod_{i,j,k} n_{ijk}!}. \tag{15.3.5.1}$$

Starting from this we can test the hypothesis of independence between ‘i’ and ‘(j, k)’.

The case of ‘i’, ‘j’ and ‘k’ being ‘ways of classification’ in the sense of $n_{i00}$’s, $n_{0j0}$’s and $n_{00k}$’s being fixed from sample to sample (but not $n_{0jk}$’s) is also of some interest. But we shall not consider that case in the present paper.

The extension of the problems of the two-way tables of section 15.2 to those of the three-way tables of section 15.3 is a rather big conceptual jump, but the extension from three-way tables to those of higher dimensions involves no such jump and will not be discussed in this chapter except for some remarks toward the end.

15.4. The derivation of the $\chi^2$-test by the union-intersection principle.

Let a random sample of size $n$ from some population be classified into $k$ ($< n$) mutually exclusive and exhaustive categories according to some observable characteristics (qualitative or quantitative) and let the probability of a random observation falling in the $i$-th category be $p_i$ with $p_i > 0$ and $\sum_{i=1}^{k} p_i = 1$. Let $n_i$ denote the observed frequency in the $i$-th category with of course $\sum_i n_i = n$. Also let $n'(1 \times k) = n' = (n_1, n_2, \ldots, n_k)$ and $p'(1 \times k) = p' = (p_1, p_2, \ldots, p_k)$. We have now

$$P[n' \mid p'] = \frac{n!}{\prod_i n_i!}\prod_i p_i^{n_i}. \tag{15.4.1}$$

15.4.1. A simple hypothesis $H_0 : p' = p_0'$ against the composite alternative $H : p' \ne p_0'$.

Consider first the most powerful test at a level say $\beta$ of $H_0 : p' = p_0'$ against a specific $p_1' \ne p_0'$, which, by the Neyman-Pearson lemma, will be as follows: reject $H_0$ if

$$P[n' \mid p_1']/P[n' \mid p_0'] \ge \lambda, \tag{15.4.1.1}$$

and don't reject $H_0$ otherwise, where, given $\lambda$, the size of the critical region (15.4.1.1) under $p_0'$ should be $\beta(\lambda, p_0', p_1', n)$. Substituting in (15.4.1.1) from (15.4.1) and taking logarithms on both sides we see after a little simplification that (15.4.1.1) $\Rightarrow$

$$\frac{a'(p_1)[n - np_0]}{\sqrt{a'(p_1)\Lambda^0 a(p_1)}} \ge \frac{\log\lambda - n\,a'(p_1)p_0}{\sqrt{a'(p_1)\Lambda^0 a(p_1)}}, \tag{15.4.1.2}$$

that is, $\ge c(p_1', p_0', n)$ say,

where $a'(p_1) = [\log(p_{11}/p_{10}), \log(p_{21}/p_{20}), \ldots, \log(p_{k1}/p_{k0})]$; and $\Lambda^0 = (\sigma_{ij})$ is the variance-covariance matrix of $n_1, n_2, \ldots, n_k$ under $H_0 : p' = p_0'$ and where $\lambda$ is supposed to vary with, that is, depend upon $p_1$. It is thus evident that, for a fixed $c$, the critical region

$$w(p_1, c) = \left\{n : \frac{a'(p_1)[n - np_0]}{\sqrt{a'(p_1)\Lambda^0 a(p_1)}} \ge c\right\} \tag{15.4.1.3}$$

is the most powerful critical region for testing $p' = p_0'$ against a specific $p' = p_1'$ ($\ne p_0'$) at a level of significance $\beta(p_1, c, n)$. Since the composite $H : p' \ne p_0'$ is the union of all $H_1 : p' = p_1'$ ($\ne p_0'$) we use the union-intersection principle [45] and take for $H_0$, against the composite $H$, the critical region

$$w(c) = \bigcup_{p_1 \ne p_0} w(p_1, c). \tag{15.4.1.4}$$


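Thus we should have

$$\text{Complement } w(c) = \left\{n : \sup_{p_1 \ne p_0}\frac{a'(p_1)[n - np_0]}{\sqrt{a'(p_1)\Lambda^0 a(p_1)}} < c\right\}. \tag{15.4.1.5}$$

Since $\sum_i n_i = n$ and $\sum_i p_{i0} = 1$, we can write

$$\frac{a'(p_1)[n - np_0]}{\sqrt{a'(p_1)\Lambda^0 a(p_1)}} = \frac{b'(p_1)[n^* - np_0^*]}{\sqrt{b'(p_1)\Lambda^{*0}b(p_1)}},$$

where $n^{*\prime} = (n_1, n_2, \ldots, n_{k-1})$, $p_0^{*\prime} = (p_{10}, p_{20}, \ldots, p_{k-1,0})$, $b'(p_1) = (b_1(p_1), b_2(p_1), \ldots, b_{k-1}(p_1))$,

$$b_i(p_1) = a_i(p_1) - a_k(p_1) = \log\left(\frac{p_{i1}\,p_{k0}}{p_{k1}\,p_{i0}}\right) \quad (\text{for } i = 1, \ldots, k-1);$$

and $\Lambda^{*0}$ is the matrix formed by cutting out the $k$-th row and the $k$-th column of $\Lambda^0$. Notice that each $b_i(p_1)$ can assume any value on the real line and conversely, given any real vector $b_0' = (b_{10}, \ldots, b_{k-1,0})$, the equations $b'(p_1) = b_0'$ have always a unique solution in $p_1$, e.g.

$$p_{i1}/p_{k1} = (p_{i0}/p_{k0})\,e^{b_{i0}} = \lambda_{i0}, \text{ say,}$$

$$\text{or}\quad p_{i1} = \lambda_{i0}\Big/\Big(1 + \sum_{i=1}^{k-1}\lambda_{i0}\Big) \ (\text{for } i = 1, 2, \ldots, k-1) \quad\text{and}\quad p_{k1} = 1\Big/\Big(1 + \sum_{i=1}^{k-1}\lambda_{i0}\Big).$$

Hence we have [25]

$$\sup_{p_1 \ne p_0}\frac{a'(p_1)[n - np_0]}{\sqrt{a'(p_1)\Lambda^0 a(p_1)}} = \sup_b \frac{b'[n^* - np_0^*]}{\sqrt{b'\Lambda^{*0}b}} = +\left\{[n^* - np_0^*]'\,(\Lambda^{*0})^{-1}\,[n^* - np_0^*]\right\}^{1/2}.$$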


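We next observe that

$$\sigma_{ij} = -n\,p_{i0}\,p_{j0} \ \text{ if } i \ne j \quad\text{and}\quad \sigma_{ii} = n\,p_{i0}(1 - p_{i0}).$$

It is easy to check that if $(\Lambda^{*0})^{-1} = (\sigma^{ij})$, then $\sigma^{ij} = 1/np_{k0}$ if $i \ne j$ and $\sigma^{ii} = 1/np_{i0} + 1/np_{k0}$. We have, therefore,

$$\sup_{p_1 \ne p_0}\frac{a'(p_1)[n - np_0]}{\sqrt{a'(p_1)\Lambda^0 a(p_1)}} = +\left[\sum_{i=1}^{k}\frac{(n_i - np_{i0})^2}{np_{i0}}\right]^{1/2}. \tag{15.4.1.6}$$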



Going back to (15.4.1.5) and thence to (15.4.1.4) we now notice that (15.4.1.4) reduces to

$$w(c) = \left\{n : \sum_{i=1}^{k}\frac{(n_i - np_{i0})^2}{np_{i0}} \ge c^2\right\}. \tag{15.4.1.7}$$

Since the left side of the inequality in (15.4.1.7) is essentially non-negative it is easy to see that we obtain a non-trivial solution only when $c > 0$. It is thus seen that the $\chi^2$-critical region is obtained by using the union-intersection with respect to variation over the alternatives $p_1'$ ($\ne p_0'$), keeping fixed a quantity $c$ defined (in terms of $p_1$ and $\lambda$) by the right side of (15.4.1.2). This means of course letting $\lambda$ vary with $p_1$ in an appropriate manner. Now if $n$ is large, we go back to (15.4.1.3) and observe from the asymptotic normality of the left side of the inequality (15.4.1.2) that, as $n \to \infty$,

$$\beta(p_1, c, n) \to \frac{1}{\sqrt{2\pi}}\int_c^{\infty}\exp(-\tfrac{1}{2}t^2)\,dt. \tag{15.4.1.8}$$

In large samples it is thus seen that keeping $c$ fixed means making $\beta$ the same for all $p_1$’s, which means that in large samples the $\chi^2$ critical region (15.4.1.7) comes out as a union-intersection critical region of type I [43]. For large $n_i$’s (the approximation would be good enough even for moderately large values of $n_i$’s) it is well known that the left side of the inequality in (15.4.1.7) is asymptotically distributed as a $\chi^2$ with d.f. $(k-1)$. For a satisfactory proof see [10].


15.4.2. Test of a certain type of composite hypothesis on $p$'s against a certain type of composite alternative.

Suppose that the composite hypothesis is given by

$$H_0 : \{p_i = p_i(\theta_1, \theta_2, \ldots, \theta_r)\},\quad (\theta_1, \ldots, \theta_r) \in \omega,$$

where $p_i(\theta_1, \ldots, \theta_r)$ are $k$ known functions of $r$ ($< k$) unknown parameters. The hypothesis does not specify the values of the parameters except that they belong to a certain parametric space $\omega$. The (composite) alternative is $H \ne H_0$. For any specific $(\theta_1^0, \theta_2^0, \ldots, \theta_r^0)$, we obtain, as in the previous section, a heuristic test of the hypothesis $H_0' : \{p_i = p_i(\theta_1^0, \ldots, \theta_r^0)\}$ against $H \ne H_0'$ which has a critical region

$$w(c, \theta_1^0, \ldots, \theta_r^0) = \left\{n : \sum_{i=1}^{k}\frac{(n_i - np_i(\theta_1^0, \ldots, \theta_r^0))^2}{np_i(\theta_1^0, \ldots, \theta_r^0)} \ge c^2\right\}. \tag{15.4.2.1}$$

This critical region is the region of rejection of

$$H_0' : \{p_i = p_i(\theta_1^0, \ldots, \theta_r^0)\} \ \text{ for a specific } (\theta_1^0, \ldots, \theta_r^0).$$
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Now to reject Ho : (p, = p(0, .... Oion .... P would be to reject 


Hy : pi = р(00,..., 00) for every (09, ..., 02)є0 
and thus, using the union-intersection principle for the second time, we have for Hy 
the critical region. 


we; Hy) = ff) we, 01, .... Ө) 
(01, ...» Orden 


200 E (тпр, ...,0,))® 
= Јаз Inf У (m—npi£i, ++)" Soa 2. (15.4.2.2 
| (ө. „едеп int NPAO es 00) | s ) 


which is precisely the minimum y?-critical region. The equations giving 6/'s in terms 
of тув, in the form, say £s for minimum x? are 


9. а (пб... 0 0 (forj 1,2... 0). (15.4.2.8) 


00, Y пр, .... 4) 


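It has been shown [10] that for large $n_i$'s the equation (15.4.2.3) can be replaced by the maximum likelihood equation (so much easier to work with), the likelihood function being

\[ \phi = \frac{n!}{\prod_{i=1}^{k} n_i!} \prod_{i=1}^{k} p_i^{n_i}(\theta_1, \ldots, \theta_r). \tag{15.4.2.4} \]

The maximum likelihood equations can be put in the form

\[ \frac{\partial}{\partial \theta_j} \Big[ \frac{1}{n} \log \phi \Big] \equiv \sum_{i=1}^{k} \frac{n_i}{np_i} \frac{\partial p_i}{\partial \theta_j} \equiv \sum_{i=1}^{k} \frac{n_i - np_i}{np_i} \frac{\partial p_i}{\partial \theta_j} = 0 \quad (j = 1, 2, \ldots, r). \tag{15.4.2.5} \]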
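
To make the relation between the minimum $\chi^2$ equations and the maximum likelihood equations concrete, here is a minimal Python sketch under stated assumptions. The one-parameter model with three cells, $p(\theta) = (\theta^2,\ 2\theta(1-\theta),\ (1-\theta)^2)$, the counts, and all function names are hypothetical choices made for illustration; the grid search is a crude stand-in for solving the minimum $\chi^2$ equations exactly.

```python
def cell_probs(theta):
    """Hypothetical one-parameter model with k = 3 cells."""
    return [theta ** 2, 2 * theta * (1 - theta), (1 - theta) ** 2]


def chi_square(counts, theta):
    """Pearson statistic sum (n_i - n*p_i(theta))^2 / (n*p_i(theta))."""
    n = sum(counts)
    return sum((n_i - n * p) ** 2 / (n * p)
               for n_i, p in zip(counts, cell_probs(theta)))


def max_likelihood_theta(counts):
    """Closed-form root of the likelihood equation for this particular model."""
    n = sum(counts)
    return (2 * counts[0] + counts[1]) / (2 * n)


def min_chi_square_theta(counts, steps=1000):
    """Grid search for the minimiser of chi-square over the open interval (0, 1)."""
    grid = [i / steps for i in range(1, steps)]
    return min(grid, key=lambda theta: chi_square(counts, theta))


counts = [30, 50, 20]  # hypothetical cell counts, n = 100
theta_ml = max_likelihood_theta(counts)
theta_mc = min_chi_square_theta(counts)
print(theta_ml, theta_mc)
print(chi_square(counts, theta_ml), chi_square(counts, theta_mc))
```

For these counts the two estimates coincide to the resolution of the grid, which illustrates (but of course does not prove) why the likelihood equations can stand in for the minimum $\chi^2$ equations when the $n_i$'s are large.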

It has been proved (and will be published in a later monograph) that if we start from the more general probability model of (15.1.1), pose a composite hypothesis problem of the type of section 15.4.2, and then use the union-intersection principle in the sense of sections 15.4.1 and 15.4.2, we obtain the corresponding $\chi^2$-critical region with a structure which is just stated in the following sections.

15.5. Some useful theorems on $\chi^2$.

We state here several theorems the first two of which are well known. For a careful statement and satisfactory proof see, e.g., [10]. The remaining theorems have been proved, and the proofs will be offered in later papers.

Theorem I. If we start out from a $P(n \mid p^0)$ or $\phi_0$ of the form

\[ \phi_0 = n! \prod_{i=1}^{r} (p_i^0)^{n_i} \Big/ \prod_{i=1}^{r} n_i! \quad \Big(\text{under } \sum_i n_i = n \text{ and } \sum_i p_i^0 = 1 \text{ and } p_i^0 > 0\Big), \]

then, as $n \to \infty$, the sampling distribution (under $\phi_0$) of $\sum_{i=1}^{r} (n_i - np_i^0)^2 / np_i^0$ tends to the $\chi^2$-distribution with d.f. $r-1$.

This is used for testing the simple hypothesis $H_0 : p = p^0$ against the composite alternative $H : p \neq p^0$ under the general model: $\phi = n! \prod_{i=1}^{r} p_i^{n_i} / \prod_{i=1}^{r} n_i!$ associated with a sampling scheme in which only $n$ is fixed from sample to sample but not any of the $n_i$'s.

Theorem II. Let $P(n \mid H_0)$ or $\phi$ be of the form $\phi_0 = n! \prod_{i=1}^{r} p_i^{n_i}(\theta_1, \ldots, \theta_s) / \prod_{i=1}^{r} n_i!$, where $\sum_i n_i = n$ and where $p_1(\theta_1, \ldots, \theta_s), \ldots, p_r(\theta_1, \ldots, \theta_s)$ $(s < r)$ are $r$ functions of the $s$ parameters such that, for all points of a non-degenerate interval $A$ in the $s$-dimensional space of the $\theta_j$'s, the $p_i$'s satisfy the following conditions:

(a) $\sum_{i} p_i(\theta_1, \ldots, \theta_s) = 1$, (b) $p_i(\theta_1, \ldots, \theta_s) > c^2 > 0$ for all $i$, (c) every $p_i$ has continuous derivatives $\partial p_i / \partial \theta_j$ and $\partial^2 p_i / \partial \theta_j \partial \theta_k$, and (d) the $r \times s$ matrix $[\partial p_i / \partial \theta_j]$ is of rank $s$.

Then, assuming that $A$ is so defined that the true population parameter point $(\theta_1^0, \ldots, \theta_s^0)$ ($= \theta^0$, say) is an interior point of $A$, (i) there exists one and only one solution $(\hat\theta_1, \ldots, \hat\theta_s)$ ($= \hat\theta$, say) of the equation (15.4.2.5) such that $\hat\theta \to \theta^0$ in probability as $n \to \infty$. (ii) Furthermore, this $\hat\theta$ has the property that, as $n \to \infty$, the sampling distribution of $\sum_{i=1}^{r} (n_i - np_i(\hat\theta_1, \ldots, \hat\theta_s))^2 / np_i(\hat\theta_1, \ldots, \hat\theta_s)$ tends to the $\chi^2$-distribution with d.f. $(r-1)-s$. The probability measure $\phi_0$ under which (i) and (ii) hold is the one which we obtain by sticking into $\phi_0$ the true population point $\theta^0$.

This is used to test the composite hypothesis $H_0 : p_i = p_i(\theta_1, \ldots, \theta_s)$ $(i = 1, \ldots, r)$ against the composite alternative $H \neq H_0$ under the general model of theorem I associated with the same sampling scheme. Under $H_0$ (and $H_0$ alone) will $\phi$ be of the form $\phi_0$ of this theorem.


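Theorem III. With the same $\phi_0$ as in theorem II and with $p_i(\theta_1, \ldots, \theta_s)$'s being defined as functions of $(\theta_1, \ldots, \theta_s)$ subject to the same conditions as in theorem II, suppose that we have furthermore, under the null hypothesis, $H_0 : f_k(\theta_1, \ldots, \theta_s) = f_k^0$, where $f_k^0$'s are fixed and $k = 1, 2, \ldots, t < s$, such that over $A$, (e) each $f_k$ has continuous derivatives $\partial f_k / \partial \theta_j$ and $\partial^2 f_k / \partial \theta_j \partial \theta_l$ and (f) the $t \times s$ matrix $[\partial f_k / \partial \theta_j]$ is of rank $t$. Next, let us write down likelihood equations subject to the given constraints on $\theta_j$'s:

\[ \sum_{i=1}^{r} \frac{n_i - np_i(\theta_1, \ldots, \theta_s)}{np_i(\theta_1, \ldots, \theta_s)} \frac{\partial p_i}{\partial \theta_j} + \sum_{k=1}^{t} \lambda_k \frac{\partial f_k}{\partial \theta_j} = 0 \quad (j = 1, \ldots, s), \]
\[ f_k(\theta_1, \ldots, \theta_s) = f_k^0 \quad (k = 1, 2, \ldots, t). \tag{15.5.1} \]

Then (i) there exists one and only one solution $(\hat\theta_1, \ldots, \hat\theta_s)$, $(\hat\lambda_1, \ldots, \hat\lambda_t)$ of these equations such that $\hat\theta \to \theta^0$ in probability as $n \to \infty$. (ii) This $\hat\theta$ has the further property that, as $n \to \infty$, the sampling distribution of $\sum_{i} (n_i - np_i(\hat\theta_1, \ldots, \hat\theta_s))^2 / np_i(\hat\theta_1, \ldots, \hat\theta_s)$ tends to the $\chi^2$-distribution with d.f. $(r-1) - (s-t)$. As in theorem II, the probability measure $\phi_0$ under which (i) and (ii) are true is of course the one which we obtain by sticking into $\phi_0$ the true population point $\theta^0$.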


This is used in the same general situation that ties in with theorem II, the only difference being that here the composite hypothesis is given as $H_0 : p_i = p_i(\theta_1, \ldots, \theta_s)$ subject to additional constraints on $\theta_j$'s. Theoretically, under certain reasonable restrictions, we might try to use the constraints on $\theta_j$'s to eliminate some of the $\theta_j$'s and express the $p_i$'s in terms of the proper number of independent $\theta_j$'s.

But in doing so we might well obtain the $p_i$'s as functions of these independent $\theta_j$'s such that they have one set of functional forms for one domain of values of the eliminated parameters, another set of functional forms over another domain, and so on. We shall of course be concerned with the functional forms assumed over a sufficiently small neighbourhood enclosing the true (but unknown) parameter point. But this is not directly known and hence in general this problem cannot be thrown back directly on theorem II. A simple illustration will make this clear. Suppose that $p_i = p_i(\theta_1, \ldots, \theta_s)$ $(i = 1, 2, \ldots, r > s)$ and we have furthermore the hypothesis: $\theta_1^2 + \theta_2^2 = 1$ and that $(\theta_1, \theta_2)$ may take all values on the euclidean plane. Then we have $\theta_1 = \pm\sqrt{1-\theta_2^2}$ so that we have $p_i = p_i(\sqrt{1-\theta_2^2}, \theta_2, \ldots, \theta_s)$ or $p_i = p_i(-\sqrt{1-\theta_2^2}, \theta_2, \ldots, \theta_s)$ according as the eliminated parameter $\theta_1$ is +ve or -ve. It is this that prevents a direct appeal being made to theorem II.

But in most practical situations, it is far more convenient to use the customary method of Lagrangian multipliers, and this theorem provides the justification for that.


Theorem IV. With a general $\phi$ of the form $n! \prod_i p_i^{n_i} / \prod_i n_i!$ (under $\sum_i n_i = n$ and $\sum_i p_i = 1$, $p_i > 0$), suppose that we have the constraints $f_j(p_1, \ldots, p_r) = 0$ ($j = 1, 2, \ldots, s < r$ and $\sum_i p_i = 1$ is one of the $s$ constraints), where $f_j$'s are defined over an interval $A$ in the $r$-dimensional space of the $p_i$'s such that (a) $f_j$'s have continuous derivatives $\partial f_j / \partial p_i$ and $\partial^2 f_j / \partial p_i \partial p_l$ and (b) the $s \times r$ matrix $[\partial f_j / \partial p_i]$ is of rank $s$.

Next, properly using the condition $\sum_i p_i = 1$, let us write down the maximum likelihood equations subject to the given constraints on $p_i$'s:

\[ \frac{n_i}{np_i} + \sum_{j=1}^{s} \lambda_j \frac{\partial f_j}{\partial p_i} = 0 \quad (i = 1, 2, \ldots, r), \]
\[ f_j(p_1, \ldots, p_r) = 0 \quad (j = 1, 2, \ldots, s < r). \tag{15.5.2} \]

Then, assuming that the true population parameter point $p^0 = (p_1^0, p_2^0, \ldots, p_r^0)$ is an interior point of $A$, (i) there exists one and only one solution $\hat p = (\hat p_1, \ldots, \hat p_r)$ and $(\hat\lambda_1, \ldots, \hat\lambda_s)$ of (15.5.2) such that $\hat p \to p^0$ in probability as $n \to \infty$. (ii) This $\hat p$ has the further property that, as $n \to \infty$, the sampling distribution of $\sum_{i=1}^{r} (n_i - n\hat p_i)^2 / n\hat p_i$ tends to the $\chi^2$-distribution with d.f. $r - (r-s) = s$.

This is used in the same general situations that tie in with theorems II or III, the only difference being that here the composite hypothesis is given as $H_0 : f_j(p_1, \ldots, p_r) = 0$ $(j = 1, \ldots, s)$. Theoretically it is possible (under certain mild and reasonable restrictions) to move back and forth between the set-ups of theorems II, III and IV, but in practice, depending upon the form in which the hypothesis is put forward, it might algebraically be more convenient to follow one method rather than the others.


Theorem V. If, instead of a $\phi$ corresponding to a single multinomial distribution, we have a $\phi$ corresponding to the product of several multinomial distributions, given by

\[ \prod_{i=1}^{r} \Big[ n_{i0}! \prod_{j=1}^{s_i} p_{ij}^{n_{ij}} \Big/ \prod_{j=1}^{s_i} n_{ij}! \Big], \quad \text{with } \sum_{j=1}^{s_i} p_{ij} = p_{i0} \text{ (say)} = 1 \quad (i = 1, 2, \ldots, r), \]

then the theorems I-IV will all hold good as $n_{i0}$'s $\to \infty$ with $n_{i0}/n$ held constant, with the difference that (i) the statistic concerned will be $\sum_{i,j} (n_{ij} - n_{i0}\hat p_{ij})^2 / n_{i0}\hat p_{ij}$ and (ii) the limiting sampling distribution will be a $\chi^2$ with d.f. = total number of cells − number of independent multinomial distributions − number of independent parameters $p_{ij}$ that are to be estimated from the data. Corresponding to theorems I, II, III and IV there will be respectively (i) a simple hypothesis $H_0 : p_{ij} = p_{ij}^0$, (ii) a composite $H_0 : p_{ij} = p_{ij}(\theta_1, \ldots, \theta_u)$ (with $u < s-r$ where $s = \sum_i s_i$), (iii) a composite $H_0 : p_{ij} = p_{ij}(\theta_1, \ldots, \theta_u)$ subject to $f_k(\theta_1, \ldots, \theta_u) = f_k^0$ (with $k = 1, 2, \ldots, v < u$) and (iv) a composite $H_0$ which is defined in terms of constraints on $p_{ij}$'s of the form: $f_k(p_{ij}\text{'s}) = 0$ $(k = 1, 2, \ldots, u < s-r)$. It should be remembered that all the hypotheses (i), (ii), (iii) and (iv) must be so framed as not to violate the basic conditions $\sum_j p_{ij} = p_{i0} = 1$ $(i = 1, 2, \ldots, r)$. Also, even in considering the alternatives, these basic conditions must not be violated. Notice that $j$ may be a double or a multiple subscript like $j_1, j_2, \ldots, j_k$; in other words it may be a ‘bivariate’ or a ‘multivariate’ situation. Also $i$ may be a double or a multiple subscript like $i_1, i_2, \ldots, i_l$; in other words, it may be, in addition, a ‘two-way classification’ or a ‘multiway classification’. In practice, the number of categories $s_i$ of $j$ for a given $i$ will, in general, be independent of $i$, that is, the same for all $i$'s; but it is better to consider a more general theoretical formulation.

This should be used when we have a mixed model with both ‘variates’ and 
‘ways of classification’ (as e.g. in subsections 15.2.2, 15.3.2, 15.3.3 and 15.3.4). 


Theorem VI. Under a hypergeometric $\phi$ (e.g. of the type (15.2.3)), as $n_{i0}$'s and $n_{0j}$'s $\to \infty$, with $n_{i0}/n$ and $n_{0j}/n$ held fixed, the sampling distribution of

\[ \sum_{i,j} \Big( n_{ij} - \frac{n_{i0} n_{0j}}{n} \Big)^2 \Big/ \frac{n_{i0} n_{0j}}{n} \]

tends to the $\chi^2$-distribution with d.f. $(r-1)(s-1)$.

The proof of theorem III that has been constructed is on the same lines as that of theorems I and II, and is rather long. Theorem IV can be thrown back upon either theorem II or III, without much difficulty. Theorem V is proved on the same lines as theorems I and II. More than one proof of theorem VI is available, but a more straightforward and rigorous proof can be constructed on the same lines as in Feller's book [12]. All this material on theorems III-VI will be given in a later monograph.

Going back to theorems III-V, we may note that in each case a part of what is included in the hypothesis might be moved over into the model and the rest retained under the hypothesis. This would mean that under the relevant non-null hypothesis (in connection with the power of the test) the part that comes under the hypothesis could be violated but not the part that has been moved over into the model. This will not affect the distribution on the null hypothesis, i.e., the significance points, etc., but would of course change the structure of the power function. For example, going back to theorem III, we can, if we wish, make $p_i = p_i(\theta_1, \ldots, \theta_s)$ $(i = 1, 2, \ldots, r > s)$ a part of our model and define the null hypothesis as $f_k(\theta_1, \ldots, \theta_s) = f_k^0$ $(k = 1, 2, \ldots, t < s)$. Under this changed set-up even if we used the same test (for the null hypothesis) as discussed in connection with the original set-up of theorem III, we would have a power function which would be different from the one associated with the original set-up. But of course another test (which is not discussed here) with a greater power would be more appropriate in this situation. It is obvious that we can introduce similar changes in the original set-up of theorems IV and V.

It may be further observed in connection with theorems II-V, that each gives two main results listed as (i) and (ii). For example, if we denote the maximum likelihood estimate of the parameter point $\theta$ by $\hat\theta$, and the true parameter point by $\theta^0$, theorem II gives that under $H_0$ and for large $n$, (i) $\hat\theta \to \theta^0$ in probability and (ii) $\sum_{i} (n_i - n\hat p_i)^2 / n\hat p_i$ tends to have the $\chi^2$ distribution with appropriate degrees of freedom. Now result (i) will hold if we take for $\hat\theta$ any BAN [30] or best asymptotically normal estimate and not just the maximum likelihood estimate which itself is of course a BAN estimate. For example, the minimum $\chi^2$ estimate would be one such and so also the minimum $\chi_1^2$ estimate whose $\chi_1^2$ is defined by $\sum_i (n_i - np_i)^2 / n_i$. Next, result (ii) can be replaced by the result that under $H_0$ and large $n$,

\[ \text{(iii)} \quad \sum_{i=1}^{r} [n_i - np_i(\mathrm{BAN})]^2 / n_i \quad \text{or} \quad \sum_{i=1}^{r} [n_i - np_i(\mathrm{BAN})]^2 / np_i(\mathrm{BAN}) \]

each tends to have the $\chi^2$ distribution with the same degrees of freedom as for (ii). We here denote by $p_i(\mathrm{BAN})$ the value of $p_i$ obtained by substituting in $p_i(\theta)$ any BAN estimate of $\theta$. We have of course similar results associated with theorems III, IV and V. This gives us a good deal of latitude and leeway so far as large sample tests of the hypothesis associated with theorems II-V are concerned. In a sense this has been adequately proved in [30], but a proof which is more on the lines of the present development will be given in a later monograph.
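
The two forms of the statistic in (iii) differ only in the denominator, $n_i$ against $np_i$. The short Python sketch below computes both for a set of fitted cell probabilities; the counts, the fitted values and the function names are hypothetical, and the form with denominator $n_i$ assumes that every observed count is positive.

```python
def pearson_chi_square(counts, fitted_probs):
    """Sum of (n_i - n*p_i)^2 / (n*p_i) with fitted cell probabilities p_i."""
    n = sum(counts)
    return sum((n_i - n * p) ** 2 / (n * p) for n_i, p in zip(counts, fitted_probs))


def modified_chi_square(counts, fitted_probs):
    """Sum of (n_i - n*p_i)^2 / n_i; assumes every n_i is positive."""
    n = sum(counts)
    return sum((n_i - n * p) ** 2 / n_i for n_i, p in zip(counts, fitted_probs))


# Hypothetical counts and fitted probabilities p_i(BAN).
counts = [30, 50, 20]
fitted_probs = [0.3025, 0.495, 0.2025]
chi2 = pearson_chi_square(counts, fitted_probs)
chi2_modified = modified_chi_square(counts, fitted_probs)
print(chi2, chi2_modified)
```

When the fit is close the two values are nearly equal, in line with the statement that both have the same limiting distribution.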

15.6. Large sample $\chi^2$ tests of the null hypotheses in a two-way table.

15.6.1. The problem of section 15.2.1. We consider section 15.2.1, start from (15.2.1.1), maximise $\log \phi_0$ with respect to $p_{i0}$'s and $p_{0j}$'s subject to $\sum_i p_{i0} = \sum_j p_{0j} = 1$ (using Lagrangian multipliers) and end up with the maximum likelihood solutions: $\hat p_{i0} = n_{i0}/n$ and $\hat p_{0j} = n_{0j}/n$. The number of independent parameters estimated from the data is $r+s-2$ and hence by section 15.5 the test of independence here is based on a statistic which has the $\chi^2$-distribution with degrees of freedom $rs-1-(r+s-2) = (r-1)(s-1)$ and whose form is

\[ \sum_{i,j} \frac{\big( n_{ij} - n \frac{n_{i0}}{n} \frac{n_{0j}}{n} \big)^2}{n \frac{n_{i0}}{n} \frac{n_{0j}}{n}} = \sum_{i,j} \frac{\big( n_{ij} - \frac{n_{i0} n_{0j}}{n} \big)^2}{\frac{n_{i0} n_{0j}}{n}}. \tag{15.6.1.1} \]

15.6.2. The problem of section 15.2.2. We start from (15.2.2.1) and maximise $\log \phi_0$ with respect to $q_{0j}$'s subject to $\sum_j q_{0j} = 1$ and end up with the maximum likelihood solutions: $\hat q_{0j} = n_{0j}/n$. The number of independent parameters estimated from the data is $s-1$ and hence by section 15.5 the test here is to be based on a statistic which has the $\chi^2$-distribution with degrees of freedom $r(s-1)-(s-1) = (r-1)(s-1)$ and whose form is

\[ \sum_{i,j} \frac{( n_{ij} - n_{i0} \hat q_{0j} )^2}{n_{i0} \hat q_{0j}} = \sum_{i,j} \frac{\big( n_{ij} - \frac{n_{i0} n_{0j}}{n} \big)^2}{\frac{n_{i0} n_{0j}}{n}}. \tag{15.6.2.1} \]

15.6.3. The problem of section 15.2.3. We start from (15.2.3). Here we have already (under the null hypothesis) $p_{ij} = n_{i0} n_{0j} / n^2$ and using the remarks of section 15.5 we note that the test is based on a statistic having the $\chi^2$-distribution with degrees of freedom $rs-(r+s-1) = (r-1)(s-1)$ and the form

\[ \sum_{i,j} \Big( n_{ij} - \frac{n_{i0} n_{0j}}{n} \Big)^2 \Big/ \frac{n_{i0} n_{0j}}{n}. \tag{15.6.3.1} \]
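
All three set-ups of section 15.6 lead to the same computing formula, so a single routine serves for each. The Python sketch below is one way it might be written; the function name and the example table are assumptions made for illustration, and every row and column total is assumed to be positive.

```python
def independence_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x s table of counts.

    The statistic is sum over cells of (n_ij - n_i0*n_0j/n)^2 / (n_i0*n_0j/n).
    """
    r = len(table)
    s = len(table[0])
    row_totals = [sum(row) for row in table]                      # n_i0
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(s)]  # n_0j
    n = sum(row_totals)
    statistic = 0.0
    for i in range(r):
        for j in range(s):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    return statistic, (r - 1) * (s - 1)


# Hypothetical 2 x 2 table with n = 100.
statistic, degrees_of_freedom = independence_chi_square([[10, 20], [30, 40]])
print(statistic, degrees_of_freedom)
```

The routine returns the same number under each of the three models; what differs between them is the distribution of the statistic under the respective alternatives.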


15.7. Large sample $\chi^2$ tests of the null hypotheses in a three-way table.
15.7.1. The problems of 15.3.1, i.e., where ‘i’, ‘j’ and ‘k’ are all ‘variates’.
15.7.1a. The problems of 15.3.1a.
Independence between ‘i’ and ‘j’ | ‘k’. Under $H_0$ of (15.3.1.1) we shall have

\[ \phi_0 \sim \prod_{i,j,k} ( p_{i0k}\, p_{0jk} / p_{00k} )^{n_{ijk}}. \tag{15.7.1.1} \]

To test the hypothesis here we maximise $\log \phi_0$ with respect to $p_{i0k}$'s, $p_{0jk}$'s and $p_{00k}$'s subject to $\sum_i p_{i0k} = \sum_j p_{0jk} = p_{00k}$ and $\sum_k p_{00k} = 1$, and end up with the maximum likelihood solutions: $\hat p_{i0k} = n_{i0k}/n$, $\hat p_{0jk} = n_{0jk}/n$ and $\hat p_{00k} = n_{00k}/n$. The number of independent parameters estimated from the data is $(r-1)t+(s-1)t+(t-1)$. And hence by section 15.4.2 the test of conditional independence is here based on a statistic which has the $\chi^2$-distribution with degrees of freedom $rst-1-t(r-1)-t(s-1)-(t-1) = t(r-1)(s-1)$ and whose form is

\[ \sum_{i,j,k} \Big( n_{ijk} - \frac{n_{i0k} n_{0jk}}{n_{00k}} \Big)^2 \Big/ \frac{n_{i0k} n_{0jk}}{n_{00k}}. \tag{15.7.1.2} \]
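
The statistic (15.7.1.2) is the sum, over the levels of ‘k’, of the two-way statistic computed within each level. A minimal Python sketch follows; the function name, the indexing convention `n[i][j][k]` and the example counts are assumptions made for illustration, and every marginal total used as a divisor is assumed to be positive.

```python
def conditional_independence_chi_square(n):
    """Return (statistic, d.f.) for an r x s x t array of counts n[i][j][k].

    The statistic is the sum over (i, j, k) of
    (n_ijk - n_i0k*n_0jk/n_00k)^2 / (n_i0k*n_0jk/n_00k).
    """
    r, s, t = len(n), len(n[0]), len(n[0][0])
    statistic = 0.0
    for k in range(t):
        n_i0k = [sum(n[i][j][k] for j in range(s)) for i in range(r)]
        n_0jk = [sum(n[i][j][k] for i in range(r)) for j in range(s)]
        n_00k = sum(n_i0k)
        for i in range(r):
            for j in range(s):
                expected = n_i0k[i] * n_0jk[j] / n_00k
                statistic += (n[i][j][k] - expected) ** 2 / expected
    return statistic, t * (r - 1) * (s - 1)


# Hypothetical 2 x 2 x 2 counts: the k = 0 layer is [[10, 20], [30, 40]]
# and the k = 1 layer is [[5, 5], [5, 5]].
counts = [[[10, 5], [20, 5]], [[30, 5], [40, 5]]]
statistic, degrees_of_freedom = conditional_independence_chi_square(counts)
print(statistic, degrees_of_freedom)
```

In this example the second layer fits the hypothesis exactly, so the whole statistic comes from the first layer.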


Independence between ‘i’ and ‘k’ and also between ‘j’ and ‘k’. This can be handled exactly on the lines of section 15.6 and will not be discussed separately.

Independence between ‘i’, ‘j’ and ‘k’. To test this we start from the hypothesis of (15.3.1.3) giving

\[ \phi_0 \sim \prod_{i,j,k} ( p_{i00}\, p_{0j0}\, p_{00k} )^{n_{ijk}}, \tag{15.7.1.3} \]

maximise $\log \phi_0$ with respect to $p_{i00}$'s, $p_{0j0}$'s and $p_{00k}$'s subject to $\sum_i p_{i00} = \sum_j p_{0j0} = \sum_k p_{00k} = 1$, and end up with the maximum likelihood solutions: $\hat p_{i00} = n_{i00}/n$, $\hat p_{0j0} = n_{0j0}/n$ and $\hat p_{00k} = n_{00k}/n$. The number of independent parameters estimated from the data is $(r+s+t-3)$ and hence by section 15.3.2 the test is here based on a statistic which has the $\chi^2$-distribution with degrees of freedom $rst-1-(r+s+t-3) = rst-r-s-t+2$, and whose form is

\[ \sum_{i,j,k} \Big( n_{ijk} - \frac{n_{i00} n_{0j0} n_{00k}}{n^2} \Big)^2 \Big/ \frac{n_{i00} n_{0j0} n_{00k}}{n^2}. \tag{15.7.1.4} \]

15.7.1b. The problems of 15.3.1b.

Independence between ‘(i, j)’ and ‘k’. Under (15.3.1.4) we shall have

\[ \phi_0 \sim \prod_{i,j,k} ( p_{ij0}\, p_{00k} )^{n_{ijk}}. \tag{15.7.1.5} \]

To test this hypothesis we maximise $\log \phi_0$ with respect to $p_{ij0}$'s and $p_{00k}$'s subject to $\sum_{i,j} p_{ij0} = \sum_k p_{00k} = 1$ and end up with the maximum likelihood solutions: $\hat p_{ij0} = n_{ij0}/n$ and $\hat p_{00k} = n_{00k}/n$. The number of independent parameters estimated from the data is $(rs-1)+(t-1)$ and hence by section 15.5 the test is based on a statistic having the $\chi^2$-distribution with d.f. $rst-1-[(rs-1)+(t-1)] = (rs-1)(t-1)$ and having the form

\[ \sum_{i,j,k} \Big( n_{ijk} - \frac{n_{ij0} n_{00k}}{n} \Big)^2 \Big/ \frac{n_{ij0} n_{00k}}{n}. \tag{15.7.1.6} \]

Independence between ‘i’ and ‘k’ and between ‘j’ and ‘k’. Since this can be handled on the same lines as in section 15.6, it will not be separately discussed.

The ‘no interaction’ hypothesis of (15.3.2.6). This has been discussed in detail in [47] and will not be given here. The test will be based on a statistic having the $\chi^2$-distribution with d.f. $(r-1)(s-1)(t-1)$ and having a rather complicated form which will be reproduced in a later monograph.

15.7.2. The problems of 15.3.2, i.e., when ‘i’ and ‘j’ are ‘variates’ and ‘k’ is a ‘way of classification’.
15.7.2a. The problems of 15.3.2a.
Independence between ‘i’ and ‘j’ for each k. Under $H_0$ of (15.3.2.1) we start from

\[ \phi_0 \sim \prod_{i,j,k} ( p_{i0k}\, p_{0jk} )^{n_{ijk}}, \tag{15.7.2.1} \]

and maximise $\log \phi_0$ with respect to $p_{i0k}$'s and $p_{0jk}$'s subject to $\sum_i p_{i0k} = \sum_j p_{0jk} = p_{00k} = 1$, and end up with the maximum likelihood solutions: $\hat p_{i0k} = n_{i0k}/n_{00k}$, $\hat p_{0jk} = n_{0jk}/n_{00k}$. The number of independent parameters to be estimated from the data is $t(r-1)+t(s-1)$ and hence by section 15.5 the test here is to be based on a statistic having the $\chi^2$-distribution with d.f. $t(rs-1)-t(r-1)-t(s-1) = t(r-1)(s-1)$ and having the form

\[ \sum_{k} \sum_{i,j} \Big( n_{ijk} - \frac{n_{i0k} n_{0jk}}{n_{00k}} \Big)^2 \Big/ \frac{n_{i0k} n_{0jk}}{n_{00k}}. \tag{15.7.2.2} \]

The problems under (15.3.2.2) or (15.3.2.3) will not be discussed separately.
15.7.2b. The problems of 15.3.2b.

The hypothesis that $p_{ijk}$ is independent of ‘k’, i.e., that $p_{ijk}$ is a pure function of (i, j). Under $H_0$ of (15.3.2.4) we start from

\[ \phi_0 \sim \prod_{i,j,k} q_{ij0}^{\,n_{ijk}}, \tag{15.7.2.3} \]

maximise $\log \phi_0$ with respect to $q_{ij0}$'s subject to $\sum_{i,j} q_{ij0} = 1$, and end up with the maximum likelihood solutions: $\hat q_{ij0} = n_{ij0}/n$. The number of independent parameters to be estimated from the data is $(rs-1)$ and hence by section 15.5 the test is to be based on a statistic having the $\chi^2$-distribution with d.f. $t(rs-1)-(rs-1) = (rs-1)(t-1)$ and having the form

\[ \sum_{k} \sum_{i,j} \Big( n_{ijk} - \frac{n_{00k} n_{ij0}}{n} \Big)^2 \Big/ \frac{n_{00k} n_{ij0}}{n}. \tag{15.7.2.4} \]

The problems under (15.3.2.5) or (15.3.2.6) will not be separately discussed here.


15.7.3. The problems of 15.3.3, i.e., when ‘i’ is a ‘variate’ and ‘j’ and ‘k’ are ‘ways of classification’.

15.7.3a. The problem of 15.3.3a.
The hypothesis that for any ‘k’, $p_{ijk}$ is independent of ‘j’, i.e., that $p_{ijk}$ is a pure function of ‘i’ and ‘k’. Under $H_0$ of (15.3.3.1) we start from

\[ \phi_0 \sim \prod_{i,j,k} q_{i0k}^{\,n_{ijk}}, \tag{15.7.3.1} \]

and maximise $\log \phi_0$ with respect to $q_{i0k}$'s subject to $\sum_i q_{i0k} = 1$, and end up with the maximum likelihood solution $\hat q_{i0k} = n_{i0k}/n_{00k}$. The number of independent parameters to be estimated from the data is $t(r-1)$ and hence by section 15.5 the test is to be based on a statistic having the $\chi^2$-distribution with d.f. $st(r-1) - t(r-1) = t(r-1)(s-1)$ and having the form

\[ \sum_{k} \sum_{i,j} \Big( n_{ijk} - \frac{n_{0jk} n_{i0k}}{n_{00k}} \Big)^2 \Big/ \frac{n_{0jk} n_{i0k}}{n_{00k}}. \tag{15.7.3.2} \]

15.7.3b. The problem of 15.3.3b. This will be exactly on the same lines as the previous case and will not be discussed separately. We shall also omit a discussion of the problem under (15.3.3.3).

15.7.4. The problems of 15.3.4, i.e., when ‘i’ is a variate and ‘j’ and ‘k’ are ways of classification in the sense of an incomplete design.

The hypothesis that $p_{ijk}$ is independent of ‘j’, i.e., that $p_{ijk}$ is a pure function of ‘i’ and ‘k’. We start from (15.3.4), put $p_{ijk} = q_{i0k}$ and thus have $\phi_0 \sim \prod_{i,j,k} q_{i0k}^{\,n_{ijk}}$, maximise $\log \phi_0$ with respect to $q_{i0k}$'s subject to $\sum_i q_{i0k} = 1$ and end up with a solution $\hat q_{i0k}$'s in terms of $n_{ijk}$'s which is a set of functions of $n_{ijk}$'s of the same structure as the corresponding least squares solutions in linear estimation. One or two such solutions for some linked block designs will be discussed in a later paper. However, this solution, stuck into the $\chi^2$ functions, will have the $\chi^2$-distribution


with d.f. (r—1) x 1j—(ri—1). 
ј=1 


The hypothesis that $p_{ijk}$ is independent of ‘k’, i.e., that $p_{ijk}$ is a pure function of ‘i’ and ‘j’ can be handled on exactly similar lines and need not be separately considered.

15.8. Linear hypothesis. Linear hypothesis in the sense of chapter 12, on the p's or the logarithms of the p's, can be put forward, distinguishing as in chapter 12, between the model and the hypothesis, and such hypothesis can be tested either in terms of $\chi^2$ or in terms of $\chi_1^2$, in either case, substituting for the unknown free or nuisance parameters any BAN estimates and in particular, say the maximum likelihood or minimum $\chi^2$ or minimum $\chi_1^2$ estimate. There are theorems in this sector closely analogous to those least squares and analysis of variance theorems in the customary set-up, most of which have been considered in chapter 12. In terms of this it is possible to develop and study the analogues of most of the things we customarily do in the usual uninormal or multinormal analysis of variance, including contrasts in general and ‘main effects’ and ‘interactions’, etc., in particular. If some numerical quantities or measures are attached to the categories we can also, in terms of such numerical measures, study the hypothesis of equality of the population ‘means’ or other linear hypothesis involving these ‘means’ or population ‘variances’ or other ‘parameters’ of the probability distributions. This will be discussed in a later monograph.

15.9. Asymptotic independence of test criteria in certain situations. In many situations in which a particular hypothesis $H_0$ with an associated $\chi^2$ is the intersection of several hypotheses $H_{01}$, $H_{02}$, etc., with associated $\chi_1^2$, $\chi_2^2$, etc., it so happens that $\chi^2 = \chi_1^2 + \chi_2^2 + \text{etc.}$ and that $\chi_1^2$, $\chi_2^2$, etc., are also independently distributed, but unlike what happens in ordinary least squares analysis of variance set-up, the additivity is not in the usual algebraic sense; it is only in probability and asymptotically as $n \to \infty$ and the independence is also in the asymptotic sense. Take, for example, the hypotheses (15.3.1.1), (15.3.1.2) and (15.3.1.3), and let us call them $H_{01}$, ($H_{02}$, $H_{03}$) and $H_0$. We note that $H_0 = H_{01} \cap H_{02} \cap H_{03}$.

Let the associated $\chi^2$'s be denoted by $\chi_1^2$, $\chi_2^2$, $\chi_3^2$ and $\chi^2$. Then, in this case, it has been shown [25], that, in large samples and under the null hypothesis $H_0$, $\chi_1^2$, $\chi_2^2$ and $\chi_3^2$ are independent central $\chi^2$'s and $\chi_1^2 + \chi_2^2 + \chi_3^2 \to \chi^2$ in probability. We have an exactly similar situation for the group of hypotheses (15.3.1.4), (15.3.1.5) and (15.3.1.6). These are situations in multivariate analysis. There are similar situations in analysis of variance also, for example, with the group of hypotheses (15.3.2.4), (15.3.2.5) and (15.3.2.6) or with the group (15.3.3.1), (15.3.3.2) and (15.3.3.3). But this will not be true, for example, with a similar group of hypotheses on an incomplete block design or more general types of designs indicated in section 15.3.4. For linear hypotheses on p's or their logarithms, the mathematical conditions for this asymptotic independence and asymptotic additivity in probability are strikingly similar to the corresponding conditions for the customary least squares set-up discussed in chapter 14. For more general types of hypotheses under more general types of models these conditions are a little more complicated with no obvious analogue in the usual least squares theory developed so far. All this will be discussed in a later monograph.
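
The following Python sketch illustrates, under stated assumptions, the sense in which the component statistics add up. It assumes that the components are the conditional independence statistic of ‘i’ and ‘j’ within each level of ‘k’ and the two marginal two-way statistics for (‘i’, ‘k’) and (‘j’, ‘k’), with the statistic for complete independence as the total; this reading of the three hypotheses, the function names and the example counts are all assumptions. The degrees of freedom add exactly, whereas the statistics themselves need not add algebraically in a finite sample.

```python
def pearson(observed, expected):
    """Sum of (o - e)^2 / e over matching cells; assumes every e > 0."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def two_way(table):
    """Independence statistic for a two-way table of counts."""
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    n = sum(rows)
    observed = [x for row in table for x in row]
    expected = [ri * cj / n for ri in rows for cj in cols]
    return pearson(observed, expected)


def three_way_components(n):
    """Return ((chi_cond, chi_ik, chi_jk, chi_all), matching d.f.) for n[i][j][k]."""
    r, s, t = len(n), len(n[0]), len(n[0][0])
    n_ik = [[sum(n[i][j][k] for j in range(s)) for k in range(t)] for i in range(r)]
    n_jk = [[sum(n[i][j][k] for i in range(r)) for k in range(t)] for j in range(s)]
    n_k = [sum(n_ik[i][k] for i in range(r)) for k in range(t)]
    n_i = [sum(n_ik[i]) for i in range(r)]
    n_j = [sum(n_jk[j]) for j in range(s)]
    total = sum(n_k)
    # Conditional independence of i and j: one two-way statistic per level of k.
    chi_cond = sum(two_way([[n[i][j][k] for j in range(s)] for i in range(r)])
                   for k in range(t))
    chi_ik = two_way(n_ik)
    chi_jk = two_way(n_jk)
    cells = [(i, j, k) for i in range(r) for j in range(s) for k in range(t)]
    observed = [n[i][j][k] for i, j, k in cells]
    expected = [n_i[i] * n_j[j] * n_k[k] / total ** 2 for i, j, k in cells]
    chi_all = pearson(observed, expected)
    dfs = (t * (r - 1) * (s - 1), (r - 1) * (t - 1), (s - 1) * (t - 1),
           r * s * t - r - s - t + 2)
    return (chi_cond, chi_ik, chi_jk, chi_all), dfs


# Hypothetical counts; the sum of the three components is close to,
# but generally not identical with, the overall statistic.
stats, dfs = three_way_components([[[10, 5], [20, 5]], [[30, 5], [40, 5]]])
print(stats, sum(stats[:3]), dfs)

# A table of exact product form satisfies every hypothesis, so all statistics vanish.
product_stats, product_dfs = three_way_components(
    [[[2, 5], [6, 15]], [[4, 10], [12, 30]]])
print(product_stats, product_dfs)
```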

15.10. On asymptotic power functions. For analogous null hypotheses under different probability models we have, in many situations, eventually the same $\chi^2$ with the same distribution under the respective null hypotheses. This is exemplified in sections 15.6.1, 15.6.2 and 15.6.3, also again in 15.7.1, 15.7.2b and 15.7.3a and so on. But the powers of these tests, which depend upon the distribution under the respective non-null hypotheses of the corresponding statistics, are not comparable and in that sense different. It is well known [30] that these tests are consistent, i.e., that in large samples these powers tend to 1 in each case. But the asymptotic powers in the sense of Pitman and Lehmann can be obtained and compared. The asymptotic power functions (for analogous hypotheses under different probability models) have different structures. Some results are given in the following paragraphs. A systematic development including proofs will be given in a later monograph.

15.10.1. The asymptotic power function connected with theorem I of section 15.5. Let us consider an alternative $H_n$ (depending on $n$) given by

\[ H_n : p_{in} = p_i^0 + \frac{\delta_i}{\sqrt{n}} \quad \Big(i = 1, 2, \ldots, r;\ \sum_i p_{in} = \sum_i p_i^0 = 1\Big), \tag{15.10.1.1} \]

where $\delta_i$'s are fixed. Then, as $n \to \infty$, under $H_n$, $\chi^2 = \sum_{i=1}^{r} (n_i - np_i^0)^2 / np_i^0$ tends to have the noncentral $\chi^2$-distribution with d.f. $(r-1)$ and a noncentrality parameter $\lambda = \sum_{i=1}^{r} \delta_i^2 / p_i^0$.
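
As a numerical companion to this result, the Python sketch below computes the noncentrality parameter $\sum_i \delta_i^2 / p_i^0$ and the exact mean of the statistic under $H_n$ for a finite $n$. It relies on the standard fact that a noncentral $\chi^2$ with $\nu$ degrees of freedom and noncentrality $\lambda$ (the sum of squared means) has mean $\nu + \lambda$; the function names, the null probabilities and the $\delta_i$'s are hypothetical.

```python
import math


def noncentrality(deltas, null_probs):
    """lambda = sum of delta_i^2 / p_i0."""
    return sum(d * d / p for d, p in zip(deltas, null_probs))


def exact_mean_of_chi_square(n, deltas, null_probs):
    """Exact mean of sum (n_i - n*p_i0)^2 / (n*p_i0) under p_in = p_i0 + delta_i/sqrt(n).

    Each term contributes the variance of n_i plus its squared bias,
    divided by n*p_i0.
    """
    total = 0.0
    for d, p0 in zip(deltas, null_probs):
        p = p0 + d / math.sqrt(n)
        variance = n * p * (1 - p)
        bias = n * (p - p0)
        total += (variance + bias ** 2) / (n * p0)
    return total


null_probs = [0.25, 0.5, 0.25]   # hypothetical p_i0
deltas = [0.1, -0.2, 0.1]        # hypothetical delta_i, summing to zero
lam = noncentrality(deltas, null_probs)
mean_large_n = exact_mean_of_chi_square(10 ** 6, deltas, null_probs)
print(lam, mean_large_n, (len(null_probs) - 1) + lam)
```

For large $n$ the exact mean approaches $(r-1) + \lambda$, as the limiting noncentral distribution requires.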


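15.10.2. The asymptotic power function connected with theorem II of section 15.5. Suppose we have an alternative (which has to be simple in this case) $H_n$ given by

\[ H_n : p_{in} = p_i(\theta_1^0, \theta_2^0, \ldots, \theta_s^0) + \frac{\delta_i}{\sqrt{n}} \quad \Big(i = 1, \ldots, r;\ \sum_i p_{in} = \sum_i p_i(\theta^0) = 1\Big). \tag{15.10.2.1} \]

Then, as $n \to \infty$, under $H_n$,

\[ \chi^2 = \sum_{i=1}^{r} (n_i - np_i(\hat\theta_1, \ldots, \hat\theta_s))^2 / np_i(\hat\theta_1, \ldots, \hat\theta_s) \]

tends to have the noncentral $\chi^2$-distribution with d.f. $(r-1)-s$ and a noncentrality parameter $\lambda = \delta'[I - B(B'B)^{-1}B']\delta$, where $\delta'(1 \times r)$ is a row vector with elements $\delta_i / \sqrt{p_i(\theta_1^0, \ldots, \theta_s^0)}$ $(i = 1, 2, \ldots, r)$ and $B$ stands for the $r \times s$ matrix $\big[ \frac{1}{\sqrt{p_i}} \frac{\partial p_i}{\partial \theta_j} \big]$ evaluated at $(\theta_1^0, \ldots, \theta_s^0)$.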
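
When there is a single parameter ($s = 1$) the matrix $B$ is a column vector and the quadratic form $\delta'[I - B(B'B)^{-1}B']\delta$ reduces to $\delta'\delta - (B'\delta)^2 / B'B$. The Python sketch below evaluates it for that special case only, taking $B$ to have elements $(1/\sqrt{p_i})\,\partial p_i/\partial\theta$ at the true parameter point; the model, the numbers and the function name are hypothetical.

```python
import math


def noncentrality_one_parameter(deltas, probs, derivs):
    """delta'[I - B(B'B)^{-1}B']delta when B is a single column (s = 1).

    deltas: the delta_i of the alternative (they sum to zero);
    probs:  p_i at the true parameter point;
    derivs: dp_i/dtheta at the true parameter point.
    """
    d = [delta / math.sqrt(p) for delta, p in zip(deltas, probs)]
    b = [dp / math.sqrt(p) for dp, p in zip(derivs, probs)]
    dd = sum(x * x for x in d)
    bd = sum(x * y for x, y in zip(b, d))
    bb = sum(x * x for x in b)
    return dd - bd * bd / bb


# Hypothetical model p(theta) = (theta^2, 2*theta*(1-theta), (1-theta)^2) at theta = 0.5.
probs = [0.25, 0.5, 0.25]
derivs = [1.0, 0.0, -1.0]
lam_off_model = noncentrality_one_parameter([0.1, -0.2, 0.1], probs, derivs)
lam_along_model = noncentrality_one_parameter([0.1, 0.0, -0.1], probs, derivs)
print(lam_off_model, lam_along_model)
```

A displacement along the direction in which the model itself moves (the second case) gives zero noncentrality, since to this order it is indistinguishable from a change in $\theta$ within the hypothesis.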
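15.10.3. The asymptotic power function connected with case (ii) of theorem V of section 15.5. Consider an $H_n$ given by

\[ H_n : p_{ijn} = p_{ij}(\theta_1^0, \ldots, \theta_u^0) + \frac{\delta_{ij}}{\sqrt{n}}, \quad \text{with } u < s-r \text{ and } s = \sum_{i=1}^{r} s_i, \text{ and also of course } \sum_j p_{ijn} = p_{i0n} = 1 \quad (i = 1, \ldots, r). \tag{15.10.3.1} \]

Then under $H_n$ and as $n \to \infty$ subject to $q_i = n_{i0}/n$ being held constant,

\[ \chi^2 = \sum_{i,j} (n_{ij} - n_{i0}\, p_{ij}(\hat\theta_1, \ldots, \hat\theta_u))^2 / n_{i0}\, p_{ij}(\hat\theta_1, \ldots, \hat\theta_u) \]

will tend to have the noncentral $\chi^2$-distribution with the same d.f. as indicated there and with a noncentrality parameter $\lambda = \delta'[I - B(B'B)^{-1}B']\delta$, where

\[ B(s \times u) = \Big[ \frac{\sqrt{q_i}}{\sqrt{p_{ij}}} \frac{\partial p_{ij}}{\partial \theta_k} \Big] \text{ evaluated at } (\theta_1^0, \ldots, \theta_u^0), \]

with $i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, s_i$ and $k = 1, 2, \ldots, u$, and where $\delta'(1 \times s)$ is a row vector with elements $\delta_{ij}\sqrt{q_i} / \sqrt{p_{ij}(\theta_1^0, \ldots, \theta_u^0)}$.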


Another way to compare the relative efficiency of two comparable and consistent tests in a particular situation would be to consider the ratio of the exact probability of the second kind of error for the two tests and study the limiting form of this ratio as $n \to \infty$. This also will be discussed in a later monograph.

15.11. Remarks on more general decision problems. The problems discussed in this monograph, whether based on the ‘normality’ assumption as in the previous chapters or on the nonparametric model as in this chapter, have been either in terms of the Neyman-Pearson testing of hypothesis or, in several cases, in terms of confidence bounds on meaningful sets of parametric functions which might be regarded as natural measures of departure from certain null hypotheses. In many situations, however, it is of considerably more physical interest to consider more general decision problems. For example, in the analysis of variance situations with say $t$ treatments (whether on the ‘normal’ assumption or on the nonparametric model) we are likely to be far more interested in a decision rule for picking out the ‘best’ or ranking the $t$ treatments in terms of some characteristic. The decision rule has to have certain desirable (if not always optimum) properties in terms of some rational criteria. Some such decision rules already developed, both on ‘normal’ variate data and on ‘categorical’ data, and on various types of problems including those of factor analysis and classification will be discussed in a later monograph.


15.12. Some remarks on factorial experiments. Looking for a possible motivation behind the customary (and mostly ‘normal’ variate) analysis, one cannot help feeling that factorial experiments (whether on the ‘normal’ variate type of data or more general types of data) present a problem which is essentially different from that of the rest of analysis of variance, e.g., the usual tests of significance of treatment differences. Assume, for simplicity, in the beginning, that there is just one factor at, say, $k$ levels. One might regard these as treatments, and test whether there are significant differences between these, or, in terms of some characteristic, pick out the ‘best’ among these or rank these in some order. But that does not appear to be the relevant question here. The (second) characteristic in terms of which we have the levels seems to be a continuous variate which is observed at $k$ levels for practical convenience, and what is of interest seems to be to lay down a statistical rule by which we can, in terms of the observations at discrete levels, decide about the ‘best’ or ‘optimum’ point, the ‘best’ or ‘optimum’ being in relation to the first characteristic. Likewise, taking for example, two factors at $k$ and $l$ levels respectively the problem seems to be not to test whether there are significant differences between these $kl$ combinations regarded as treatments (which would, really, be a linear problem) or to pick out the ‘best’ among these or to rank these (in terms of some characteristic) which again would be each a really linear problem. It seems that there is a (second) characteristic in terms of which we have the $k$ levels of the first factor, and a third characteristic in terms of which we have the $l$ levels of the second factor, both these (second and third) characteristics being supposed to be continuous variates. The problem is to lay down a statistical rule by which we can decide about the ‘best’ or ‘optimum’ point (in relation to the first characteristic) on the plane of the second and third characteristics, regarded as two continuous variates, the decision rule being in terms of observations at the $kl$ discrete level combinations. This of course can be generalized to several factors. The customary analysis into ‘main-effects’, ‘interactions’ of various orders, confounding etc., all seem to point very strongly in this direction. Some work has already been done from this standpoint and further work is under way. This will be discussed in a later monograph.


APPENDIX 1 
Some Preliminary Results in Matrix Theory 


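(A.1.1): Given four matrices $A(p\times p)$, $B(p\times q)$, $C(q\times p)$ and $D(q\times q)$, if $D$ is
non-singular, then

$$\begin{vmatrix} A & B \\ C & D \end{vmatrix} = |D|\,|A - BD^{-1}C|.$$

This follows on post-multiplying by a matrix of unit determinant:

$$\begin{vmatrix} A & B \\ C & D \end{vmatrix} = \left|\begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} I(p) & 0(p\times q) \\ -D^{-1}C & I(q) \end{pmatrix}\right| = \begin{vmatrix} A - BD^{-1}C & B \\ 0 & D \end{vmatrix} = |D|\,|A - BD^{-1}C|.$$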
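As a quick sanity check of (A.1.1), the following sketch compares both sides of the identity for one small partitioned matrix. The block values, the choice p = 2, q = 1 and the helper name `det` are illustrative assumptions and not part of the text; exact rational arithmetic is used so that the two sides can be compared literally.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Illustrative blocks with p = 2, q = 1 (arbitrary values, assumed for the demo).
A = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
B = [[Fraction(1)], [Fraction(4)]]
C = [[Fraction(5), Fraction(2)]]
D = [[Fraction(7)]]

# Assemble the partitioned (p+q) x (p+q) matrix [[A, B], [C, D]].
M = [A[0] + B[0], A[1] + B[1], C[0] + D[0]]

# Since q = 1, D^{-1} is a scalar reciprocal; form A - B D^{-1} C entrywise.
d_inv = 1 / D[0][0]
schur = [[A[i][j] - B[i][0] * d_inv * C[0][j] for j in range(2)] for i in range(2)]

lhs = det(M)                  # determinant of the partitioned matrix
rhs = det(D) * det(schur)     # |D| |A - B D^{-1} C|
print(lhs, rhs)
```

A single example of course proves nothing; it only guards against sign or ordering slips when the identity is applied.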

(A.1.2): $r[A(p\times q)B(q\times s)] \le \min[r(A), r(B)]$, where $\min(x, y)$ denotes
the lesser of two real numbers $x$ and $y$.

(A.1.3): $r[A(p\times q)] = r[B(p\times p)A(p\times q)] = r[A(p\times q)C(q\times q)]$,
if $B$ and $C$ are non-singular.

(A.1.4): $r[A(p\times q)] = r[A'(q\times p)] = r[A(p\times q)A'(q\times p)]$.

(A.1.5): $\mathrm{tr}[A(p\times q)B(q\times p)] = \mathrm{tr}[B(q\times p)A(p\times q)]$.

Proof: If $A = (a_{ij})$ and $B = (b_{ji})$, then by the definition of trace we have

$$\mathrm{tr}(AB) = \sum_{i=1}^{p}\sum_{j=1}^{q} a_{ij}b_{ji} = \sum_{j=1}^{q}\sum_{i=1}^{p} b_{ji}a_{ij} = \mathrm{tr}(BA).$$
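The interchange of the two summations in the proof of (A.1.5) can be mirrored directly in code. The sketch below is illustrative only: the matrices are arbitrary and the variable names are assumptions. It evaluates the double sum in both orders without forming either product.

```python
# Illustrative rectangular matrices: A is p x q = 2 x 3, B is q x p = 3 x 2.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 8],
     [9, 10],
     [11, 12]]
p, q = len(A), len(B)

# tr(AB): sum over i of the (i, i) element of AB, i.e. sum_i sum_j a_ij b_ji.
tr_ab = sum(A[i][j] * B[j][i] for i in range(p) for j in range(q))

# tr(BA): sum over j of the (j, j) element of BA, i.e. sum_j sum_i b_ji a_ij.
tr_ba = sum(B[j][i] * A[i][j] for j in range(q) for i in range(p))

print(tr_ab, tr_ba)
```

Note that AB is 2 x 2 while BA is 3 x 3, so the two products differ in size even though their traces coincide.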


(A.1.6): $r[A(p\times q)] = r[A(p\times q)B(q\times t)] = r[C(s\times p)A(p\times q)]$,
if $q \le t$, $p \le s$ and $B$ and $C$ are respectively of ranks $q$ and $p$.
Proof: Using (A.1.2)-(A.1.3) we have

$$r[A(p\times q)] = r[A(p\times q)B(q\times t)B'(t\times q)] \le \min[r(AB), r(B')] \le r(AB).$$

But $r(AB) \le r(A)$, whence $r(A) = r(AB)$. Likewise, starting with
$A'C'$ and noting that $r(CA) = r(A'C')$, we would have, in an exactly similar manner,
$r(CA) = r(A)$, which completes the proof of (A.1.6).

(A.1.7): If $L_1(p\times n)$ $(p < n)$ is subject to $L_1L_1' = I(p)$, there exists an
$L_2(\overline{n-p}\times n)$ such that $\begin{bmatrix} L_1 \\ L_2 \end{bmatrix}$ is orthogonal. $L_2$ will be called an arbitrary completion of $L_1$.


(A.1.8): If $M(p\times p)$ is symmetric and at least p.s.d. of rank $r$ $(\le p)$, then,
out of the $p$ $c(M)$'s (i.e., roots of the determinantal equation in $c$: $|M - cI| = 0$), $r$ are
positive and the rest, $p - r$ in number, are zero. If $r = p$, the number of non-zero roots
will of course be $p$.

(A.1.9): If $M_1(p\times p)$ is symmetric and at least p.s.d. of rank $r$ $(\le p)$ and
$M_2(p\times p)$ is symmetric and p.d., there are exactly $r$ positive roots of the following
equation in $c$: $|M_1 - cM_2| = 0$, the rest, $p - r$ in number, being 0. If $r = p$, the number
of positive roots will of course be $p$.

(A.1.10): $X(p\times n)X'(n\times p)$ will be symmetric and at least p.s.d. of the same
rank as $X$ or $X'$, the common rank $r$ being $\le \min(p, n)$, where the symbol $\min(p, n)$ (which will
be frequently used later) denotes the lesser of $p$ and $n$. It is easy to see that if $p \le n$
and $X$ is of rank $p$, then $XX'$ is p.d.

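(A.1.11): If $A(q\times q)$ is symmetric p.d., $B(p\times q)A(q\times q)B'(q\times p)$ is symmetric
and at least p.s.d. of the same rank as $B$.

Proof: Since $A$ is symmetric p.d., there exists, by (A.3.9), a non-singular
$\tilde T(q\times q)$ such that $A = \tilde T\tilde T'$. Hence $BAB' = (B\tilde T)(B\tilde T)'$, which, by (A.1.10),
is symmetric and at least p.s.d. of the same rank as $B\tilde T$. But $B\tilde T$ is of the same rank
as $B$, since $\tilde T$ is non-singular, whence the theorem follows.

(A.1.12): If $A(p\times p)$ is symmetric and at least p.s.d. of rank $r \le p$ and
$B(p\times p)$ is non-singular, $BAB'$ is symmetric and at least p.s.d. of rank $r$.

Proof: If $A$ is symmetric and at least p.s.d. of rank $r$, then by (A.3.10), there
exist a non-singular $\tilde T_1(r\times r)$ and a $T_2(\overline{p-r}\times r)$ such that without any loss of generality
we can put

$$A = \begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}\begin{bmatrix} \tilde T_1' & T_2' \end{bmatrix}. \quad\text{Therefore,}\quad BAB' = \left(B\begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}\right)\left(B\begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}\right)',$$

which, by (A.1.10), is symmetric and at least p.s.d. of the same rank as $B\begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}$. But, since
$B$ is non-singular and $\begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}$ is obviously of rank $r$, therefore, $B\begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}$ is of rank $r$
and thus $BAB'$ is of rank $r$.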

(A.1.13): If $M_1(p\times p)$ is symmetric and at least p.s.d. of rank $r$ $(\le p)$ and
$M_2(p\times p)$ is symmetric p.d., then (i) all the roots of the equation in $c$: $|M_1 - cM_2| = 0$
are zero if and only if $M_1 = 0$, and (ii) all the roots are unity if and only if $M_1 = M_2$.

Proof: Part (i) of (A.1.13) is a direct consequence of (A.1.9). To prove part
(ii), put $c = 1 - e$. We have then the equation in $e$: $|(M_1 - M_2) + eM_2| = 0$, whence it
follows that all roots of the equation are zero (i.e., all roots of the equation in $c$ are
unity) if and only if $M_1 - M_2 = 0$, i.e., $M_1 = M_2$, which proves part (ii) of (A.1.13).

(A.1.14): If $M$ is a $(p+q)\times(p+q)$ symmetric matrix shown as, say,

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix},$$

partitioned into $p$ and $q$ rows and columns,
and if $M_{22}$ is non-singular, then (i) $M_{11} - M_{12}M_{22}^{-1}M_{12}'$ is symmetric, and (ii) of rank
$r - q$, where $q$ is (evidently) the rank of $M_{22}$, and $r$ denotes the rank of $M$ (evidently
satisfying $q \le r \le p+q$).

Proof: Part (i) is obvious if we remember that $M_{11}$, $M_{22}$ (and thus $M_{22}^{-1}$)
are symmetric and so also $M_{12}M_{22}^{-1}M_{12}'$. For part (ii) we first observe that the rank
of $M$ would be unaltered if it were pre-multiplied and/or post-multiplied by two
conformable non-singular matrices. Post-multiply $M$ by the conformable non-singular
matrix (of rank $p+q$):

$$\begin{bmatrix} I(p) & 0 \\ -M_{22}^{-1}M_{12}' & I(q) \end{bmatrix}$$

and premultiply by the transpose of this matrix. Then we have:

$$\text{rank of } M = \text{rank of } \begin{bmatrix} I(p) & -M_{12}M_{22}^{-1} \\ 0 & I(q) \end{bmatrix}\begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix}\begin{bmatrix} I(p) & 0 \\ -M_{22}^{-1}M_{12}' & I(q) \end{bmatrix},$$

i.e., rank of $\begin{bmatrix} M_{11} - M_{12}M_{22}^{-1}M_{12}' & 0 \\ 0 & M_{22} \end{bmatrix}$. But the rank of this last matrix is
evidently the same as that of $M_{22}$ (which is $q$) plus that of $(M_{11} - M_{12}M_{22}^{-1}M_{12}')$. This
proves part (ii) of (A.1.14).
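The rank statement in part (ii) of (A.1.14) can be checked on a small singular example. The sketch below is illustrative only: the matrix, the choice p = 2, q = 1 and the helper name `rank` are assumptions. Ranks are computed by exact Gaussian elimination over the rationals, so there is no tolerance to tune.

```python
from fractions import Fraction

def rank(m):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for col in range(len(rows[0])):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

# Illustrative symmetric matrix with p = 2, q = 1 (values are assumptions).
M11 = [[1, 1], [1, 1]]
M12 = [1, 1]          # the p x 1 block M12, stored as a flat list
M22 = 2               # the 1 x 1 block M22 (non-singular)

M = [M11[0] + [M12[0]], M11[1] + [M12[1]], M12 + [M22]]

# The matrix M11 - M12 M22^{-1} M12' of (A.1.14), formed entrywise since q = 1.
schur = [[Fraction(M11[i][j]) - Fraction(M12[i] * M12[j], M22) for j in range(2)]
         for i in range(2)]

r = rank(M)            # rank of the full matrix
q = rank([[M22]])      # rank of M22
print(r, q, rank(schur))
```

Here M has two identical rows, so r = 2 rather than p + q = 3, and the reduced matrix has rank r - q = 1 as the lemma asserts.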

(A.1.15): If $M$ has the same structure as in (A.1.14) and is, in addition, at
least p.s.d. of rank $r$ $(q \le r \le p+q)$, then $M_{11} - M_{12}M_{22}^{-1}M_{12}'$ is also at least p.s.d. of
rank $r - q$.

Proof: Since $M$ is symmetric and at least p.s.d. of rank $r$ $(q \le r \le p+q)$,
premultiplying and post-multiplying it by the same conformable non-singular matrices
as in the proof of (A.1.14) and using next (A.1.12), we observe that

$$\begin{bmatrix} M_{11} - M_{12}M_{22}^{-1}M_{12}' & 0 \\ 0 & M_{22} \end{bmatrix}$$

is at least p.s.d. of rank $r$. Hence $M_{11} - M_{12}M_{22}^{-1}M_{12}'$ is evidently at least p.s.d. and
since (A.1.14) shows that it is of rank $r - q$, the theorem (A.1.15) follows.


(A.1.16): If

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix},$$

partitioned into $p$ and $q$ rows and columns, is symmetric and at least p.s.d. of rank
$r$ $(q \le r \le p+q)$, and if $p \le q$ and $M_{11}$ and $M_{22}$ are both non-singular (i.e., in this
situation both p.d. of ranks $p$ and $q$ respectively) and if $s$ denotes the rank of
$M_{12}(p\times q)$ (evidently $s \le p \le q$), then the $p$ roots of the $p$-th degree equation in
$c$: $|cM_{11} - M_{12}M_{22}^{-1}M_{12}'| = 0$ have the following properties: (i) $0 \le$ all $c$'s $\le 1$, (ii)
out of the $p$ $c$'s, $r - q$ are $< 1$ and the rest, $p - (r - q)$ $(= p + q - r \le p)$ in number,
are $= 1$; (iii) also out of the $p$ $c$'s, $s$ $(\le p)$ are $> 0$ and the rest, $p - s$ $(\le p)$, are $= 0$.

Proof: We note that since $M_{22}$ and hence $M_{22}^{-1}$ is p.d. of rank $q$ and $M_{12}$
is of rank $s$, therefore, from (A.1.11), $M_{12}M_{22}^{-1}M_{12}'$ is symmetric and at least p.s.d.
of rank $s$ $(\le p \le q)$. Also $M_{11}(p\times p)$ is supposed to be p.d. Hence by (A.1.9), out
of the $p$ roots of the equation in $c$: $|cM_{11} - M_{12}M_{22}^{-1}M_{12}'| = 0$, $s$ are $> 0$ and the rest,
$p - s$ in number, are $= 0$.

Next, putting $c = 1 - e$, we have the equation in $e$: $|eM_{11} - (M_{11} - M_{12}M_{22}^{-1}M_{12}')| = 0$.
But $M_{11}$ is symmetric p.d. and, by (A.1.14) and (A.1.15), $M_{11} - M_{12}M_{22}^{-1}M_{12}'$
is symmetric and at least p.s.d. of rank $r - q$ $(\le p)$. Hence, out of the $p$ roots of the
equation in $e$, $r - q$ are $> 0$ and the rest, $p - (r - q)$ $(= p + q - r \le p)$ in number, are
$= 0$. Since $c = 1 - e$, this means that, out of the $p$ roots of the equation in $c$, $r - q$
are $< 1$ and the rest, $p + q - r$ in number, are $= 1$. This completes the proof of
(A.1.16).

(A.1.17): With the same set-up as in (A.1.16), the roots of the equation in
$c$: $|cM_{11} - M_{12}M_{22}^{-1}M_{12}'| = 0$ are all zero if and only if the rank of $M_{12}$ is zero, i.e.,
$M_{12}$ is the null matrix.

This is a direct consequence of (A.1.16). With regard to theorems (A.1.16)
and (A.1.17) we observe that in statistical applications we shall always be considering
the special case $r = p+q$, that is, the case where $M$ is symmetric p.d. In this
situation we state and prove two theorems on transformations, (A.3.16) and (A.3.17).

(A.1.18): Every non-zero characteristic root of A(pxq) B(qxp) is a (non- 
zero) characteristic root of B(qx p) A(p xq) and vice versa. 

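Proof: If $c$ is any (non-zero) characteristic root of $AB$, we have, by definition,
$|AB - cI| = 0$ or, by using (A.1.1),

$$\text{(A.1.18.1)} \qquad \begin{vmatrix} cI(p) & A \\ B & I(q) \end{vmatrix} = 0,$$

or

$$\text{(A.1.18.2)} \qquad \begin{vmatrix} cI(q) & B \\ A & I(p) \end{vmatrix} = 0, \quad\text{i.e.,}$$

(A.1.18.3) $|BA - cI| = 0$, which proves (A.1.18).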


There is, in fact, a stronger result than (A.1.18), namely that, not only is every nonzero 
characteristic root of AB a root of BA and vice versa, but that each such root has the 
same multiplicity in relation to both matrices AB and BA. This follows if we notice 
that the left sides of (A.1.18.1) and (A.1.18.2) are the characteristic functions of AB 
and BA and then relate AB to BA by using (A.1.1). 


(A.1.19): (i) If $B(p\times p)$ is non-singular, the roots of the equation in
$c$: $|A(p\times p) - cB(p\times p)| = 0$ are the same as the characteristic roots of $AB^{-1}$ or of
$B^{-1}A$; and (ii) in (A.1.16) the roots of the equation in $c$: $|cM_{11} - M_{12}M_{22}^{-1}M_{12}'| = 0$ are
the same as the characteristic roots of $M_{11}^{-1}M_{12}M_{22}^{-1}M_{12}'$ or of $M_{12}'M_{11}^{-1}M_{12}M_{22}^{-1}$ (with
the exception of zero roots in the case where $p < q$). The proof is obvious.


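(A.1.20): $\mathrm{tr}_t[A(p\times p)] = \sum_{i_1 < i_2 < \dots < i_t} c_{i_1}(A)\,c_{i_2}(A)\cdots c_{i_t}(A)$, where $\mathrm{tr}_t A$ stands
for the sum of all $t\times t$ minors (formed by the intersection of any $t$ rows of $A$ with the $t$ columns
bearing the same numbers), the indices $i_1, \dots, i_t$ running over $1, 2, \dots, p$, and, in particular,

$$\mathrm{tr}_1 A = \mathrm{tr}\,A = \sum_{i=1}^{p} a_{ii} = \sum_{i=1}^{p} c_i(A) \quad\text{and}\quad \mathrm{tr}_p A = \prod_{i=1}^{p} c_i(A) = |A|.$$

(A.1.18) coupled with (A.1.20) supplies another proof of the relation:

$\mathrm{tr}(AB) = \mathrm{tr}(BA)$ (see (A.1.5)).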
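The way (A.1.18) and (A.1.20) combine can be illustrated numerically. Since the $\mathrm{tr}_t$ are the elementary symmetric functions of the roots, AB and BA share their non-zero roots (with multiplicity) exactly when their $\mathrm{tr}_t$ agree, with the extra ones for the larger product vanishing. The sketch below uses arbitrary integer matrices; the helper names `matmul`, `det` and `tr_t` are assumptions made for the illustration.

```python
from itertools import combinations

def matmul(x, y):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def tr_t(m, t):
    """Sum of all t x t minors on rows and columns bearing the same numbers."""
    return sum(det([[m[i][j] for j in idx] for i in idx])
               for idx in combinations(range(len(m)), t))

A = [[1, 2, 3], [4, 5, 6]]          # p x q = 2 x 3
B = [[7, 8], [9, 10], [11, 12]]     # q x p = 3 x 2

AB = matmul(A, B)                   # 2 x 2
BA = matmul(B, A)                   # 3 x 3

coeffs_ab = [tr_t(AB, t) for t in (1, 2)]
coeffs_ba = [tr_t(BA, t) for t in (1, 2, 3)]
print(coeffs_ab, coeffs_ba)
```

The first entries of the two lists are tr(AB) and tr(BA); the last entry for BA is its determinant, which vanishes because BA has one more (zero) root than AB.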


(A.1.21): If (a) $d_1 <$ all $c(AB^{-1}) < d_2$ $(d_1 > 0)$, then (b) $(d_1)^t\,\mathrm{tr}_t(B) < \mathrm{tr}_t(A)
< (d_2)^t\,\mathrm{tr}_t(B)$ $(t = 1, 2, \dots, p)$, where $A$ and $B$ are two $p\times p$ p.d. matrices. Notice
that (b) is a necessary (though not a sufficient) condition for (a).

Proof: It is easy to check that “$d_1 <$ all $c(AB^{-1})$” $\Leftrightarrow$ $(A - d_1B)$ is p.d.
$\Leftrightarrow$ $(A_t - d_1B_t)$ $(t = 1, \dots, p)$ is p.d. (where $A_t - d_1B_t$ is a submatrix formed by
the intersection of any $t$ rows of $(A - d_1B)$ with the $t$ columns bearing the same numbers)
$\Leftrightarrow$ $d_1 <$ all $c(A_tB_t^{-1})$ $(t = 1, \dots, p)$. Now, if all $c(A_tB_t^{-1}) > d_1$, one consequence
is that

$$\text{(A.1.21.1)} \qquad \prod_{i=1}^{t} c_i(A_tB_t^{-1}) > (d_1)^t, \quad\text{i.e.,}\quad |A_t|/|B_t| > (d_1)^t.$$

For a given $t$, summing over different possible submatrices, we have

$$\text{(A.1.21.2)} \qquad \mathrm{tr}_t A > (d_1)^t\,\mathrm{tr}_t B.$$

Using the same kind of argument for the other half of the inequality
and remembering that $t = 1, 2, \dots, p$, and combining, we have the following result.

(A.1.21.3) If $d_1 <$ all $c(AB^{-1}) < d_2$, then $(d_1)^t\,\mathrm{tr}_t(B) < \mathrm{tr}_t(A)
< (d_2)^t\,\mathrm{tr}_t(B)$ $(t = 1, 2, \dots, p)$.

By a slight rephrasing (which is obviously permissible here) we have the result (A.1.21).


(A.1.22): If $A(p\times p)$ is symmetric p.d. and $B(p\times p)$ is symmetric and at least
p.s.d., then (i) all $c(AB)$ are non-negative and (ii) $c_{\min}(A)\,c_{\min}(B) \le$ all $c(AB) \le c_{\max}(A)\,c_{\max}(B)$,
where $c_{\max}(M)$ and $c_{\min}(M)$ stand respectively for the largest and smallest roots (both
non-negative) of any $M$ that is symmetric and at least p.s.d. [45]

Proof: By (A.3.3) there are orthogonal matrices $L_A(p\times p)$ and $L_B(p\times p)$ such that
$A = L_AD_{c(A)}L_A'$ and $B = L_BD_{c(B)}L_B'$, and thus $AB = L_AD_{c(A)}L_A'L_BD_{c(B)}L_B'$.

Now using (A.1.18) (and noting that here $p = q$, so that all characteristic roots
are the same in both products), we have the two-way relation

$$c(AB) = c(D_{c(A)}L_A'L_BD_{c(B)}L_B'L_A) = c(D_{c(A)}MD_{c(B)}M'),$$

where $M$ stands for $L_A'L_B$. Notice that $MM' = L_A'L_BL_B'L_A = L_A'L_A = I(p)$ (since
$L_A$ and $L_B$ are each orthogonal), so that $M$ itself is orthogonal. Also note that $D_{c(A)}M$ is
non-singular since $M$, being orthogonal, is non-singular, and $D_{c(A)}$ is non-singular, because all
the $c(A)$'s are positive.


Using (A.1.18) again we find that $c(AB) = c(D_{\sqrt{c(A)}}MD_{c(B)}M'D_{\sqrt{c(A)}})$,
and since $D_{c(B)}$ is obviously symmetric p.s.d. by virtue of $B$ being p.s.d., we notice
by using (A.1.11) that $D_{\sqrt{c(A)}}MD_{c(B)}M'D_{\sqrt{c(A)}}$ is symmetric and at least p.s.d.,
and thus all $c(AB)$ are non-negative. This proves part (i). For part (ii), let us go back
to $D_{c(A)}MD_{c(B)}M'$, denote by $\lambda_i$ and $\mu_j$ the characteristic roots of $A$ and $B$, observe
that here all $\lambda_i > 0$ and all $\mu_j \ge 0$, and next observe that, if $c$ is to be a characteristic
root of $AB$ (here all roots are non-negative), there exists a set of (real) numbers $x_1$,
$x_2$, ..., $x_p$, not all of which are zero, such that the following set of equations is satisfied:

$$\text{(A.1.22.1)} \qquad \sum_{j,k} \lambda_i m_{ij}\mu_j m_{kj}x_k = cx_i \quad (i = 1, 2, \dots, p) \quad (\text{notice that } (M')_{jk} = (M)_{kj}).$$

Remembering that here $\lambda_i > 0$ and $\mu_j \ge 0$ (both sets being real), dividing
by $\lambda_i$, and squaring any member of (A.1.22.1) and summing over $i = 1, 2, \dots, p$, we
have

$$\text{(A.1.22.2)} \qquad \sum_{i}\sum_{j,k}\sum_{j',k'} m_{ij}m_{ij'}\mu_j\mu_{j'}m_{kj}m_{k'j'}x_kx_{k'} = c^2\sum_{i=1}^{p} x_i^2/\lambda_i^2.$$

Now, since $M$ is orthogonal, we have $\sum_i m_{ij}m_{ij'} = \delta_{jj'}$ (where $\delta$ is the Kronecker symbol),
so that (A.1.22.2) reduces to

$$\text{(A.1.22.3)} \qquad c^2\sum_i x_i^2/\lambda_i^2 = \sum_j \mu_j^2\sum_{k,k'} m_{kj}m_{k'j}x_kx_{k'}.$$

It is easy to check that the coefficients of $1/\lambda_i^2$ on the left hand side and those of $\mu_j^2$ on
the right hand side are each non-negative. Hence, if we replace all $\mu_j$'s by $\mu_{\max}$
and all $\lambda_i$'s by $\lambda_{\max}$, the right hand side is increased (or at least not diminished) and
the left hand side is diminished (or at least not increased). We have thus

$$\text{(A.1.22.4)} \qquad (c^2/\lambda_{\max}^2)\sum_i x_i^2 \le \mu_{\max}^2\sum_j\sum_{k,k'} m_{kj}m_{k'j}x_kx_{k'}, \quad\text{i.e.,}\quad \le \mu_{\max}^2\sum_{k,k'}\delta_{kk'}x_kx_{k'}$$

(since $M$ is orthogonal), i.e., $\le \mu_{\max}^2\sum_k x_k^2$. Since $\sum x_i^2$ is positive, it follows that $c^2 \le \lambda_{\max}^2\mu_{\max}^2$,
i.e., $c \le \lambda_{\max}\mu_{\max}$ (taking the positive square root on both sides). Thus we have

$$\text{(A.1.22.5)} \qquad \text{all } c(AB) \le c_{\max}(A)\,c_{\max}(B).$$

Likewise in (A.1.22.3), replacing all $\lambda_i$'s by $\lambda_{\min}$ and all $\mu_j$'s by $\mu_{\min}$ and arguing in a
similar manner, we have

$$\text{(A.1.22.6)} \qquad c_{\min}(A)\,c_{\min}(B) \le \text{all } c(AB).$$

Combining (A.1.22.5) and (A.1.22.6) we have part (ii) of (A.1.22).
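The bounds in part (ii) of (A.1.22) are easy to probe numerically for 2 x 2 matrices, where the characteristic roots have a closed form. The sketch below is illustrative only: the two matrices and the helper names `roots_2x2` and `matmul2` are assumptions, and the closed form presumes the roots are real, which holds for the symmetric A and B and, by part (i), for AB.

```python
import math

def roots_2x2(m):
    """Characteristic roots (smaller, larger) of a 2 x 2 matrix with real roots."""
    tr = m[0][0] + m[1][1]
    dt = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    half_gap = math.sqrt(tr * tr / 4 - dt)
    return (tr / 2 - half_gap, tr / 2 + half_gap)

def matmul2(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

A = [[2, 1], [1, 2]]      # symmetric p.d., roots 1 and 3
B = [[2, -1], [-1, 1]]    # symmetric p.d., roots (3 -/+ sqrt 5)/2

a_min, a_max = roots_2x2(A)
b_min, b_max = roots_2x2(B)
ab_min, ab_max = roots_2x2(matmul2(A, B))

lower = a_min * b_min     # c_min(A) c_min(B)
upper = a_max * b_max     # c_max(A) c_max(B)
print(lower, (ab_min, ab_max), upper)
```

Note that AB itself is not symmetric here, yet its roots are real and fall between the two products of extreme roots.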
Replacing $A$ by a complex non-singular $A$, $B$ by any complex $B$, and remembering
that $AA^*$ is hermitian p.d. and $BB^*$ is hermitian and at least p.s.d., we have the
following more general theorem, proved elsewhere [45]:

(A.1.23): $c_{\min}(AA^*)\,c_{\min}(BB^*) \le$ all $c(AB)\,\bar c(AB) \le c_{\max}(AA^*)\,c_{\max}(BB^*)$.

However, this result will not be needed in the present monograph, although a special
case will be needed.

Put $B = I$ and let $A$ be a real matrix with real roots. If $A$ is real symmetric
this will be true but this might be true even if $A$ were real but not symmetric. We
can now put $A^* = A'$ and have, as a special case of (A.1.23), the following:

(A.1.24): $c_{\min}(AA') \le$ all $c^2(A) \le c_{\max}(AA')$.

The following matrix lemma is also repeatedly used in the text: 

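(A.1.25): $c_{\min}(AB^{-1})\,c_{\min}(BC) \le$ all $c(AC) \le c_{\max}(AB^{-1})\,c_{\max}(BC)$,
where $A$, $C$ and $B$ (and hence $B^{-1}$) are real symmetric positive definite matrices of order
$p$ each.

Proof: Using (A.3.9), put $B = \tilde T\tilde T'$. We have now,

$$\text{(A.1.25.1)} \qquad c_{\max}(AB^{-1})\,c_{\max}(BC) = c_{\max}(A\tilde T'^{-1}\tilde T^{-1})\,c_{\max}(\tilde T\tilde T'C) = c_{\max}(\tilde T^{-1}A\tilde T'^{-1})\,c_{\max}(\tilde T'C\tilde T),$$

using (A.1.18), $\ge c_{\max}(\tilde T^{-1}A\tilde T'^{-1}\tilde T'C\tilde T) = c_{\max}(\tilde T^{-1}AC\tilde T)$, using (A.1.22) and
(A.1.12), that is, $\ge c_{\max}(AC)$, using (A.1.18).

The other side of the inequality in (A.1.25) follows in a similar fashion and completes
the proof of (A.1.25).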


APPENDIX 2 
Some Results in Quadratic Forms 


(A.2.1): If $A(p\times p)$ is symmetric and at least p.s.d. of rank $r$ $(\le p)$, then
(i) $a'(1\times p)A(p\times p)a(p\times 1)$ is at least a p.s.d. quadratic form in $a_i$ $(i = 1, \dots, p)$,
(ii) the stationary values of $a'Aa/a'a$ (under variation of $a$ over all non-null $a$'s) are the
characteristic roots of $A$ (all non-negative) and (iii) in particular, the largest and smallest
values of $a'Aa/a'a$ (under variation of $a$) are the largest and smallest characteristic roots
of $A$.

Proof: Part (i) is given in all textbooks and need not be proved. For part
(ii), putting $a'Aa/a'a = c$ and differentiating $c$ with respect to the elements of $a$, we have
the vector equation giving the stationary values of $c$: $Aa - ca = 0$, whence by
eliminating $a$ we have, for the stationary values of $c$, the $p$-th degree determinantal
equation in $c$: $|A - cI| = 0$. The roots of this are the so-called characteristic roots of
$A$, which proves part (ii). In this case the proof of part (iii) is obvious and will not
be separately discussed.
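Part (iii) of (A.2.1) can be seen at work in two dimensions by scanning the ratio over directions. The sketch below is illustrative only: the matrix A, the sampling grid and the helper names are assumptions. It compares the sampled extremes of a'Aa/a'a with the two characteristic roots obtained in closed form.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]   # illustrative symmetric p.d. matrix, roots 1 and 3

def roots_2x2_sym(m):
    """Characteristic roots (smaller, larger) of a 2 x 2 symmetric matrix."""
    tr = m[0][0] + m[1][1]
    dt = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    half_gap = math.sqrt(tr * tr / 4 - dt)
    return (tr / 2 - half_gap, tr / 2 + half_gap)

def quotient(a):
    """The ratio a'Aa / a'a for a non-null 2-vector a."""
    num = sum(a[i] * A[i][j] * a[j] for i in range(2) for j in range(2))
    den = sum(v * v for v in a)
    return num / den

# The ratio depends only on the direction of a, so scanning half a turn suffices.
samples = [quotient((math.cos(math.pi * k / 3600), math.sin(math.pi * k / 3600)))
           for k in range(3600)]

c_min, c_max = roots_2x2_sym(A)
print(min(samples), c_min, max(samples), c_max)
```

Sampling only brackets the extremes from inside; here the extreme directions happen to lie on the grid, so the sampled values agree with the roots up to rounding.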

(A.2.2): If $B(p\times p)$ is symmetric p.d. and $A(p\times p)$ is symmetric and at least p.s.d. of
rank $r$ $(\le p)$, then for all non-null $a$'s (i) $a'(1\times p)A(p\times p)a(p\times 1)/a'(1\times p)B(p\times p)a(p\times 1)$
is non-negative, (ii) the stationary values of $a'Aa/a'Ba$ (under variation of $a$) are the roots
of the determinantal equation in $c$: $|A - cB| = 0$ and (iii) in particular, the largest
and smallest values of $a'Aa/a'Ba$ are the largest and smallest roots of the determinantal
equation.

Proof: Part (i) is obvious. For part (ii), putting $a'Aa/a'Ba = c$ and
differentiating $c$ with respect to the elements of $a$, we have the vector equation giving the
stationary values of $c$: $Aa - cBa = 0$, whence by eliminating $a$ we have, for the
stationary values of $c$, the $p$-th degree determinantal equation in $c$: $|A - cB| = 0$,
which proves part (ii). The proof of part (iii) is now obvious.


(A.2.3): If

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix},$$

partitioned into $p$ and $q$ rows and columns $(p \le q)$, is symmetric p.d. (from which
it follows easily that $M_{11}$ and $M_{22}$ are each symmetric p.d.), then, for all non-null
$a_1(p\times 1)$ and $a_2(q\times 1)$, (i) $(a_1'M_{12}a_2)^2/[(a_1'M_{11}a_1)(a_2'M_{22}a_2)]$ is non-negative, (ii) the stationary
values of this expression are the roots of the equation in $c$: $|cM_{11} - M_{12}M_{22}^{-1}M_{12}'| = 0$
and (iii) in particular, the largest and smallest values of the expression are the largest
and smallest roots of the determinantal equation.

Proof: Part (i) is obvious. For part (ii), putting $a_1'M_{11}a_1 = \alpha_{11}$, $a_1'M_{12}a_2 = \alpha_{12}$
and $a_2'M_{22}a_2 = \alpha_{22}$, and $\alpha_{12}^2/(\alpha_{11}\alpha_{22}) = c$ (say), and differentiating $c$ with respect to
the elements of $a_1$ and $a_2$, we have the vector equations giving the stationary values
of $c$: $M_{12}a_2 - (\alpha_{12}/\alpha_{11})M_{11}a_1 = 0$ and $a_1'M_{12} - (\alpha_{12}/\alpha_{22})a_2'M_{22} = 0$, or $(\alpha_{12}/\alpha_{22})M_{22}a_2 - M_{12}'a_1 = 0$.
Eliminating $a_1$ and $a_2$ between the two vector equations, we have, for the
stationary values of $c$, the $p$-th degree determinantal equation in $c$:

$$\text{(A.2.3.1)} \qquad \begin{vmatrix} (\alpha_{12}/\alpha_{11})M_{11} & M_{12} \\ M_{12}' & (\alpha_{12}/\alpha_{22})M_{22} \end{vmatrix} = 0,$$

or, by using (A.1.1) and remembering that $c = \alpha_{12}^2/(\alpha_{11}\alpha_{22})$,

$$\text{(A.2.3.2)} \qquad |cM_{11} - M_{12}M_{22}^{-1}M_{12}'| = 0,$$

which proves part (ii). The proof of part (iii) is now obvious.
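(A.2.4): If

$$M = \begin{bmatrix} M_{11} & M_{12} & M_{13} \\ M_{12}' & M_{22} & M_{23} \\ M_{13}' & M_{23}' & M_{33} \end{bmatrix},$$

partitioned into $p$, $q$ and $r$ rows and columns $(p \le q)$, is symmetric p.d., then,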


for all xoa ally try) far allir) x " (i) 


M Ms My м ^ Mas м» 
(a; aiiai] | ajfas| jaa] 
Us М. Из» Mis Mas M3; Ma, 
is non-negative, (ii) the stationary values of this expression are the roots of the equation
in $c$:

$$|c(M_{11} - M_{13}M_{33}^{-1}M_{13}') - (M_{12} - M_{13}M_{33}^{-1}M_{23}')(M_{22} - M_{23}M_{33}^{-1}M_{23}')^{-1}(M_{12}' - M_{23}M_{33}^{-1}M_{13}')| = 0$$

and (iii) in particular, the largest and the smallest values of the expression are the largest
and smallest roots of the determinantal equation.


Proof: As before, part (i) is obvious. For part (ii), putting the expression
under (i) $= c$ (say) and proceeding in exactly the same manner as in (A.2.3) we have,
for the stationary values of $c$, the determinantal equation in $c$:


| [2a ы [re = | 2 ; 


[и | My 


Ig d Es E SERRE 
ijs | Mose cL Mss > Мз Sy eben, sfr 


(4.2.4.1) 
As in (A.1.14)-(A.1.15), premultiply the left hand side of (A.2.4.1) by the
determinant of the non-singular matrix $F$ (and postmultiply by its transpose), where $F$ is
given by


[^ I r 
p r 
уй 
[ м.м | 4 
0 
0 £ r 
q r 
The equation now reduces to 
d ERSTE 0 ] ы a 0 ] 
с 
0 Mg [^ М | 
I | = 0 
| Ка 0 ] mee 0 | | 
| [7 My о My 
or, 
$$\text{(A.2.4.2)} \qquad |c(M_{11} - M_{13}M_{33}^{-1}M_{13}') - (M_{12} - M_{13}M_{33}^{-1}M_{23}')(M_{22} - M_{23}M_{33}^{-1}M_{23}')^{-1}(M_{12}' - M_{23}M_{33}^{-1}M_{13}')| = 0.$$

Arguing as in (A.1.14)-(A.1.16) it is easy to see that (i) the roots of this $p$-th
degree equation in $c$ all lie between 0 and 1, (ii) if $M$ of (A.2.4) is p.d., then all roots
are $< 1$ and (iii) if $M_{12} - M_{13}M_{33}^{-1}M_{23}'$ is of rank $r$ $(\le p)$, then $r$ of these roots are $> 0$
and the rest, i.e., $p - r$, are $= 0$. All the roots are zero if and only if $M_{12} - M_{13}M_{33}^{-1}M_{23}' = 0$.


(A.2.5): If $M(p\times p)$ is symmetric and at least p.s.d., the statement:
“$g_1 \le a'(1\times p)M(p\times p)a(p\times 1)/a'a \le g_2$ for all non-null $a$” is exactly equivalent
to “$g_1 \le c_1 \le c_p \le g_2$,” where $c_1$ and $c_p$ stand for the smallest and largest characteristic
roots (both non-negative) of $M$. Notice that the last statement gives also the lowest
permissible value of $g_2$ and the highest permissible value of $g_1$, both in terms of the
roots of $M$. The proof is obvious from (A.2.1).


(A.2.6): If $M_2(p\times p)$ is symmetric p.d. and $M_1(p\times p)$ is symmetric and at
least p.s.d., the statement: “$g_1 \le a'(1\times p)M_1(p\times p)a(p\times 1)/a'(1\times p)M_2(p\times p)a(p\times 1) \le g_2$
for all non-null $a$” is exactly equivalent to “$g_1 \le c_1 \le c_p \le g_2$,” where $c_1$ and $c_p$ stand
for the smallest and largest roots of the equation in $c$ (all non-negative):

$$|M_1 - cM_2| = 0.$$

Notice that the last statement gives also, in terms of the roots of $M_1M_2^{-1}$, the lowest
permissible value of $g_2$ and the highest permissible value of $g_1$. The theorem is a direct
consequence of (A.2.2).


(A.2.7): The statement: “$x'(1\times p)x(p\times 1) \le g$ $(g > 0)$” is exactly equivalent
to “$-\sqrt g \le x'(1\times p)a(p\times 1) \le +\sqrt g$ (for all $a$ subject to $a'a = 1$).” The proof
follows easily from Cauchy's inequality in algebra.
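The equivalence in (A.2.7) can be illustrated in two dimensions by scanning unit vectors a. The sketch below is illustrative only: the vector x, the two values of g and the sampling grid are assumptions. It checks that x'a stays within the square-root bound when x'x does not exceed g, and that some unit a violates the bound when x'x exceeds g.

```python
import math

x = (3.0, 4.0)                 # illustrative vector with x'x = 25
xx = x[0] ** 2 + x[1] ** 2

# Values of x'a over a grid of unit vectors a = (cos t, sin t).
values = [x[0] * math.cos(2 * math.pi * k / 3600) + x[1] * math.sin(2 * math.pi * k / 3600)
          for k in range(3600)]
largest = max(abs(v) for v in values)

g_ok = 25.0                    # x'x <= g holds, so |x'a| <= sqrt(g) for every unit a
g_bad = 16.0                   # x'x > g, so some unit a must give |x'a| > sqrt(g)

holds = xx <= g_ok and largest <= math.sqrt(g_ok) + 1e-12
violated = xx > g_bad and largest > math.sqrt(g_bad)
print(holds, violated)
```

The extreme value of x'a is attained when a is proportional to x, which is the equality case of Cauchy's inequality.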


APPENDIX 3
Some Results in Transformations


(A.3.1): If $x(n\times 1) = A(n\times n)y(n\times 1)$, where $A$ is orthogonal, then $x'x = y'A'Ay = y'y$.

(A.3.2): If $x(n\times 1) = A(n\times n)y(n\times 1)$ and $u(n\times 1) = A(n\times n)v(n\times 1)$, where
$A$ is orthogonal, then $x'u = y'A'Av = y'v$.

(A.3.3): If $M(p\times p)$ is symmetric and at least p.s.d. of rank $r$ $(\le p)$, then,
denoting by $c$ the roots $c(M)$ of (A.1.8), there exists an orthogonal matrix $A(p\times p)$ (not
necessarily unique) such that $M = AD_cA'$.

(A.3.4): Under the conditions of (A.1.9), namely that $M_1(p\times p)$ is symmetric
and at least p.s.d. of rank $r$ $(\le p)$ and $M_2(p\times p)$ is symmetric p.d., there exists a
non-singular $A(p\times p)$ (not necessarily unique) such that $M_1 = AD_cA'$ and $M_2 = AA'$.

(A.3.5): The matrix $A$ of (A.3.3) will be unique, except for a post-factor
$D_{\pm 1}$ (a diagonal matrix with diagonal elements $\pm 1$), if $M$ is p.d. and all $c(M)$'s are distinct. [31].

Proof: Suppose there are two orthogonal $A$'s, say $A_1$ and $A_2$, satisfying the
condition of (A.3.3). Then we have $A_1D_cA_1' = A_2D_cA_2'$ or $A_2^{-1}A_1D_c = D_cA_2'(A_1')^{-1}$
or $A_2'A_1D_c = D_cA_2'A_1$ (since for an orthogonal $A$, $A^{-1} = A'$). If we now denote
$A_2'A_1$ by $B$ with elements $b_{ij}$, then the above equation gives:

(A.3.5.1) $b_{ij}c_j = c_ib_{ij}$ or $b_{ij}(c_i - c_j) = 0$.

Thus, if $i \ne j$ and $c_i \ne c_j$, $b_{ij} = 0$, which means that $B$ is a diagonal matrix $D_b$ with
elements, say, $b_1, \dots, b_p$.
Since $D_b = B = A_2'A_1$, we have

(A.3.5.2) $D_bD_b' = D_{b^2} = A_2'A_1A_1'A_2 = I(p)$,

so that $b_i^2 = +1$ and so $b_i = \pm 1$ $(i = 1, 2, \dots, p)$.

(A.3.5.3) Thus $D_b = D_{\pm 1}$ and hence $A_2'A_1 = D_{\pm 1}$ or $A_1 = A_2D_{\pm 1}$;

this proves (A.3.5). We note that $A$ can thus be made unique by adopting the
convention, say, that its first row be positive. It is easy to check that the transformation
is now one-to-one.

(A.3.6): If $X(p\times n)$ $(p \le n)$ is of rank $p$ (in which case, by (A.1.10), $XX'$
is symmetric p.d.), then there exists a transformation $X(p\times n) = A(p\times p)D_{\sqrt c}(p\times p)
L(p\times n)$, where $A$ is orthogonal, $LL' = I(p)$ and where the $c$'s are the characteristic roots (all positive)
of the matrix $XX'$. If all $c$'s are distinct this transformation is unique except for a
post-factor $D_{\pm 1}$ (a diagonal matrix with diagonal elements $\pm 1$) to go with $A$.

Proof: By (A.3.3) there exists an orthogonal $A(p\times p)$, which may not
be unique, such that $XX' = AD_cA'$. We now define an $L(p\times n)$ by $X = AD_{\sqrt c}L$
and note that, given $X$ and hence the $c$'s and $A$ (which we can find but which may not be
unique), this is a linear equation in $L$ uniquely solvable in terms of the above
elements. Also $LL' = D_{1/\sqrt c}A^{-1}XX'A'^{-1}D_{1/\sqrt c} = D_{1/\sqrt c}A^{-1}AD_cA'A'^{-1}D_{1/\sqrt c} = I(p)$. We
have thus the transformation $X = AD_{\sqrt c}L$, where $A$ is orthogonal and $LL' = I(p)$. Notice
that, if the $c$'s are distinct, $A$ is unique except for a post-factor $D_{\pm 1}$ and that $L$ will
go with $A$, being defined by $L = D_{1/\sqrt c}A^{-1}X$. This proves (A.3.6). It is easy to
check that for distinct roots the transformation can be made one-to-one by adopting
the convention, say, that the first row of $A$ be positive.

(A.3.7): The matrix $A$ of (A.3.4) will be unique, except for a factor $D_{\pm 1}$ (a diagonal
matrix with diagonal elements $\pm 1$), if $M_1$ is p.d. and all the roots are distinct [31].

Proof: Suppose there are two non-singular $A$'s, say $A_1$ and $A_2$, satisfying the
conditions of (A.3.4). Then we have

(A.3.7.1) $A_1D_cA_1' = A_2D_cA_2'$ and $A_1A_1' = A_2A_2'$.

These lead, after a little reduction, to

(A.3.7.2) $A_2^{-1}A_1D_c = D_cA_2^{-1}A_1$ or $BD_c = D_cB$, where $A_2^{-1}A_1 = B$.

If now $B = (b_{ij})$, say, then (A.3.7.2) leads to

(A.3.7.3) $b_{ij}c_j = c_ib_{ij}$, or $b_{ij}(c_i - c_j) = 0$, or $b_{ij} = 0$ if $i \ne j$ and $c_i \ne c_j$.
Thus $B = D_b$ (say) and so we have

(A.3.7.4) $D_bD_b' = D_{b^2} = BB' = A_2^{-1}A_1A_1'(A_2')^{-1} = A_2^{-1}A_2A_2'(A_2')^{-1} = I(p)$,
so that $b_i = \pm 1$, i.e., $D_b = D_{\pm 1}$.
(A.3.7.5) Thus $A_2^{-1}A_1 = D_{\pm 1}$ or $A_1 = A_2D_{\pm 1}$,

which proves (A.3.7). As before, we note that $A$ can be made unique by adopting
the convention, say, that its first row be positive. Check that the transformation in
this case is one-to-one.

(A.3.8): If $X_1(p\times n_1)$, $X_2(p\times n_2)$ $(p \le n_1, n_2)$ are each of rank $p$ (in which
case, by (A.1.10), $X_1X_1'$ and $X_2X_2'$ are both symmetric p.d.), then there exists a
transformation $X_1(p\times n_1) = A(p\times p)D_{\sqrt c}(p\times p)L_1(p\times n_1)$ and $X_2(p\times n_2) = A(p\times p)L_2(p\times n_2)$,
where $A$ is non-singular, the $c$'s are the roots (all positive) of the equation $|X_1X_1' - cX_2X_2'| = 0$,
and $L_1L_1' = L_2L_2' = I(p)$. If all $c$'s are distinct then this transformation is unique
except for a post-factor $D_{\pm 1}$ (a diagonal matrix with diagonal elements $\pm 1$) to go with $A$.

Proof: By (A.3.4) there exists a non-singular $A$, which may not be
unique, such that $X_1X_1' = AD_cA'$ and $X_2X_2' = AA'$. We now define $L_1(p\times n_1)$
and $L_2(p\times n_2)$ by $X_1 = AD_{\sqrt c}L_1$ and $X_2 = AL_2$ and note that, given $X_1$, $X_2$ and the $c$'s
and $A$ (which may not be unique), $L_1$ and $L_2$ are uniquely determined in terms of these.
Also $L_1L_1' = D_{1/\sqrt c}A^{-1}X_1X_1'A'^{-1}D_{1/\sqrt c} = I(p)$ and $L_2L_2' = A^{-1}X_2X_2'A'^{-1} = I(p)$. This
proves the existence of the transformation (A.3.8). Notice that if all $c$'s are
distinct, then by (A.3.7) $A$ is unique except for a post-factor $D_{\pm 1}$ and that $L_1$ and $L_2$
will go with $A$, being defined by $L_1 = D_{1/\sqrt c}A^{-1}X_1$ and $L_2 = A^{-1}X_2$. Check that the
transformation in this case is one-to-one if we adopt the convention, say, that the
first row of $A$ is to be positive.

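(A.3.9): If $M(p\times p)$ is symmetric and p.d., then there exists a non-singular
$\tilde T(p\times p)$ such that $M = \tilde T\tilde T'$, and this $\tilde T$ is unique except for a post-factor $D_{\pm 1}$ (a diagonal
matrix with diagonal elements $\pm 1$) and so will be called near unique. $\tilde T$ is a triangular matrix.

Proof of the near uniqueness: Suppose there are two $\tilde T$'s, say $\tilde T_1$ and $\tilde T_2$,
satisfying the condition. Notice from (A.1.10) that since $M$ is p.d., $\tilde T$ must necessarily
be non-singular. Thus we have

(A.3.9.1) $\tilde T_1\tilde T_1' = \tilde T_2\tilde T_2'$ or $\tilde T_2^{-1}\tilde T_1 = \tilde T_2'(\tilde T_1')^{-1}$.

Now making use of the remarks made after (1.1) we note that $\tilde T_2^{-1}\tilde T_1$ is a triangular
matrix with the same configuration as $\tilde T_1$, and $\tilde T_2'(\tilde T_1')^{-1}$ of opposite configuration.
Thus it is obvious that

(A.3.9.2) $\tilde T_2^{-1}\tilde T_1 = D_a$ (say),

whence $D_aD_a' = D_{a^2} = \tilde T_2^{-1}\tilde T_1\tilde T_1'(\tilde T_2')^{-1} = \tilde T_2^{-1}\tilde T_2\tilde T_2'(\tilde T_2')^{-1} = I(p)$, so that $a_i = \pm 1$,
i.e., $D_a = D_{\pm 1}$. Thus

(A.3.9.3) $\tilde T_2^{-1}\tilde T_1 = D_{\pm 1}$ or $\tilde T_1 = \tilde T_2D_{\pm 1}$,

which proves the near uniqueness. It is easy to check that $\tilde T$ can be made unique by
adopting the convention, say, that the diagonal elements of $\tilde T$ be positive. The
transformation in this case is one-to-one.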
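The factorization in (A.3.9) and its near uniqueness can be illustrated with the familiar Cholesky recursion. The sketch below is illustrative only: the matrix M and the helper names are assumptions, and a lower-triangular configuration with positive diagonal is assumed for the triangular factor since the text above does not fix the configuration. It builds the factor, checks that its product with its transpose reproduces M, and then shows that changing the sign of a column (a post-factor with diagonal elements +1 or -1) gives another valid factor.

```python
import math

def cholesky_lower(m):
    """Lower-triangular T with positive diagonal such that T T' = m (m symmetric p.d.)."""
    n = len(m)
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(t[i][k] * t[j][k] for k in range(j))
            if i == j:
                t[i][j] = math.sqrt(m[i][i] - s)
            else:
                t[i][j] = (m[i][j] - s) / t[j][j]
    return t

def times_transpose(t):
    """Return the product T T'."""
    n = len(t)
    return [[sum(t[i][k] * t[j][k] for k in range(n)) for j in range(n)] for i in range(n)]

M = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]          # illustrative symmetric p.d. matrix

T = cholesky_lower(M)

# Post-multiply T by the diagonal matrix with elements (1, -1, 1): flips column 2.
signs = [1.0, -1.0, 1.0]
T_alt = [[T[i][j] * signs[j] for j in range(3)] for i in range(3)]

print(T)
print(times_transpose(T) == M, times_transpose(T_alt) == M)
```

Requiring a positive diagonal singles out T among the 2^p sign variants, which is the convention suggested at the end of (A.3.9).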


(A.3.10): If

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix},$$

partitioned into $p$ and $q$ rows and columns, is symmetric and p.s.d. of rank $p$ and if
the first $p$ rows can be taken as a basis, then there exists a non-singular $\tilde T_1(p\times p)$ and
a $T_2(q\times p)$ such that

$$\begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix} = \begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}\begin{bmatrix} \tilde T_1' & T_2' \end{bmatrix},$$

and furthermore $\begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}$ is unique except for a post-factor $D_{\pm 1}$ (a diagonal matrix with
diagonal elements $\pm 1$).

Proof: Since $M$ is symmetric it is evident that, if the first $p$ rows of $M$ can
be taken as a basis, then the first $p$ columns of $M$ also can be taken as a basis. Thus
no row of $M_{11}$ is a linear function of the other rows of $M_{11}$ and no column of $M_{11}$ is a
linear function of the other columns of $M_{11}$. Hence $M_{11}$ is non-singular. Here, of
course, $M_{11}$ is symmetric p.d. We can also look at the picture in a reverse way,
namely that, since $M$ is symmetric p.s.d. of rank $p$, we can find a non-singular
principal minor of order $p$ which is of course symmetric p.d. Renumbering the rows
(and the corresponding columns) of that principal minor, we can call it $M_{11}$. Then
it is easy to show in this case that we can take the first $p$ rows or the first $p$ columns
as a basis.
Now notice that if the first $p$ rows can be taken as a basis then there exists a
matrix $\Lambda(q\times p)$ such that $M_{12}' = \Lambda M_{11}$ and $M_{22} = \Lambda M_{12}$. Combining the two
we have $M_{22} = M_{12}'M_{11}^{-1}M_{12}$ (note that $M_{11}$ is non-singular and thus we can take the
inverse). We next observe that in this set-up $M_{11}$ is p.d. Therefore, by (A.3.9),
we can find a non-singular $\tilde T_1(p\times p)$, unique except for a post-factor $D_{\pm 1}$, such that
$M_{11} = \tilde T_1\tilde T_1'$. Now find a $T_2$ defined by $T_2' = \tilde T_1^{-1}M_{12}$ and check that $M_{12} = \tilde T_1T_2'$
and $M_{22} = M_{12}'M_{11}^{-1}M_{12} = T_2T_2'$, which proves (A.3.10). We observe that,
as in (A.3.9), $\begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}$ can be made unique by adopting the convention that the
diagonal elements of $\tilde T_1$ be positive. Check that the transformation is now
one-to-one.


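(A.3.11): If $X(p\times n)$ $(p \le n)$ is of rank $r$ $(\le p)$ such that, say, the first $r$ rows of
$X$ can be taken as a basis, then there exists a transformation

$$\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} \tilde T_1 \\ T_2 \end{bmatrix}L(r\times n),$$

where $X_1$ is $r\times n$, $X_2$ is $\overline{p-r}\times n$, $\tilde T_1$ is $r\times r$ and $T_2$ is $\overline{p-r}\times r$, and
where $LL' = I(r)$ and $\tilde T_1$ is non-singular and unique except for a post-factor $D_{\pm 1}$ (a diagonal
matrix with diagonal elements $\pm 1$).

Proof: By (A.1.10) $X_1X_1'$ is symmetric p.d. of rank $r$ and by (A.3.9) there
exists a (non-singular) $\tilde T_1$ (unique except for a post-factor $D_{\pm 1}$) such that $X_1X_1' = \tilde T_1\tilde T_1'$.
We now define an $L$ by $L(r\times n) = \tilde T_1^{-1}(r\times r)X_1(r\times n)$ and note that, given $X_1$ and
hence $\tilde T_1$ (which is unique except for a post-factor $D_{\pm 1}$), $L$ is uniquely solvable in terms
of these. Also $LL' = \tilde T_1^{-1}X_1X_1'(\tilde T_1')^{-1} = \tilde T_1^{-1}\tilde T_1\tilde T_1'(\tilde T_1')^{-1} = I(r)$. Next define a $T_2$
by $T_2L = X_2$ or $T_2LL' = X_2L'$ or $T_2 = X_2L'$ and note that, given $X_1$, $X_2$ and hence
$L$ (which is near unique), $T_2$ is also uniquely solvable in terms of these. We note further
that now $X_2 = T_2L = T_2\tilde T_1^{-1}X_1 = B(\overline{p-r}\times r)X_1$ (say), where $B = T_2\tilde T_1^{-1}$. This
is obviously the condition that $X$ be of rank $r$ and $X_1$ be a basis. Hence the
transformation is proved to exist with the near uniqueness already stated. By adopting a
convention, say that of (A.3.10), the transformation can be checked to be one-to-one.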


(A.3.12): If $X_1(p\times n_1)$, $X_2(p\times n_2)$ $(p \le n_1, n_2)$ are each of rank $p$ (see (A.3.8)),
then there exists a transformation: $X_1(p\times n_1) = \tilde T(p\times p)L(p\times p)D_{\sqrt c}(p\times p)L_1(p\times n_1)$
and $X_2(p\times n_2) = \tilde T(p\times p)L_2(p\times n_2)$, where $\tilde T$ is non-singular, $L$ is orthogonal, $L_1L_1' = L_2L_2'
= I(p)$ and the $c$'s are the (all positive) roots of the equation in $c$: $|X_1X_1' - cX_2X_2'|
= 0$. If the $c$'s are distinct, the transformation can be made one-to-one by letting $\tilde T$
have a positive diagonal.

Proof: Start from the transformation (A.3.8) and, using (A.3.11), put $A(p\times p)
= \tilde T(p\times p)L(p\times p)$ where $L$ is orthogonal. Next replace $LL_2$ by a new $L_2$ and note that
$(LL_2)(LL_2)' = LL_2L_2'L' = I(p)$. The proof of near uniqueness in the case of distinct roots follows along
the same lines as in (A.3.8) and (A.3.11). This completes the proof of (A.3.12).

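(A.3.13): If $M_2(p\times p)$ is symmetric p.d. and $M_1(p\times p)$ is symmetric p.s.d.
of rank $r$ $(\le p)$, then there exists a transformation which, without any loss of
generality, we can write as

$$\text{(A.3.13.1)} \qquad M_1(p\times p) = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}D_c(r\times r)\begin{bmatrix} A_1' & A_2' \end{bmatrix} \quad\text{and}$$

$$\text{(A.3.13.2)} \qquad M_2(p\times p) = \begin{bmatrix} A_1 & \tilde A_3 \\ A_2 & A_4 \end{bmatrix}\begin{bmatrix} A_1' & A_2' \\ \tilde A_3' & A_4' \end{bmatrix},$$

where $A_1$ and $\tilde A_3$ have $p - r$ rows, $A_2$ and $A_4$ have $r$ rows, $A_1$ and $A_2$ have $r$ columns and
$\tilde A_3$ and $A_4$ have $p - r$ columns,
where the $c$'s of $D_c$ stand for the $r$ non-zero roots of the equation in $c$: $|M_1 - cM_2| = 0$,
and where $A = \begin{bmatrix} A_1 & \tilde A_3 \\ A_2 & A_4 \end{bmatrix}$ is non-singular. If the non-zero roots are distinct, the
matrix $A$ is unique except for a post-factor $D_{\pm 1}(p)$ (a diagonal matrix with diagonal elements $\pm 1$).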

Proof: Using (A.1.9) and (A.3.4) we can find a non-singular $G(p\times p)$ such
that $M_1(p\times p) = G(p\times p)D_c(p\times p)G'(p\times p)$ and $M_2(p\times p) = G(p\times p)G'(p\times p)$,
where the $c$'s of $D_c$ are the roots of the equation in $c$: $|M_1 - cM_2| = 0$. We recall
that under the conditions of the problem $r$ of these roots are positive and the rest zero.
Let us call these $r$ positive roots $c_1, c_2, \dots, c_r$. Then the $r$ columns of the $G$ matrix
(and the rows of the $G'$ matrix) that go with these positive $c$'s have to be numbered
$1, 2, \dots, r$.

Now denoting the matrix formed by these $r$ columns of $G(p\times p)$ by $A(p\times r)$ and
the remaining submatrix of $G(p\times p)$ by $B(p\times\overline{p-r})$, we can set

$$\text{(A.3.13.3)} \qquad M_1(p\times p) = A(p\times r)D_c(r\times r)A'(r\times p), \qquad M_2(p\times p) = \begin{bmatrix} A & B \end{bmatrix}\begin{bmatrix} A' \\ B' \end{bmatrix}.$$

Since $G$ is non-singular, $[A : B]$ is non-singular, and hence $A(p\times r)$ is of rank $r$ and
$B(p\times\overline{p-r})$ is of rank $(p - r)$. We can, therefore, choose $p - r$ rows from $B$ to form a
non-singular (square) matrix of order $p - r$. Let these rows be numbered $1, 2, \dots,
p - r$. Let us denote the matrix formed by these $p - r$ rows of $B(p\times\overline{p-r})$ by
$B_1(\overline{p-r}\times\overline{p-r})$ and the remaining submatrix of $B(p\times\overline{p-r})$ by $B_2(r\times\overline{p-r})$. Let
us denote the submatrices formed by the corresponding rows of $A(p\times r)$ by $A_1(\overline{p-r}\times r)$
and $A_2(r\times r)$. We can now rewrite (A.3.13.3) as


(3.13.4) — M, — p—r [ ч р(х) [AL А] т, 


pt T 
Ay 
r 


M үр "bs ad p | r 
probs eae bt sd pap BS >) 


ДЕ Б су 
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where В; is non-singular and ps zx is also non-singular. Notice that with re- 
A, B, 


numbering of the rows of B and of A, i.e., of G, the rows (and the associated) columns 
of M, and M, have also to be renumbered. Assuming now that B, is non-singular we 
can, by (A.3.11), finda transformation By(p—r x p—r) = Ap—r»p—r) L(p—1 xP -r) 
where Lis |. Now put Byrx p—r) = A«rxp—") L(p—rxp—r) (which defines A, 
in a unique way in terms of B, and D). Thus we have 


E, a | | [“ à r 
E 4, B, А, А, 0 LJ p—r 


T p-T 


and thus (A.3.13.4) is replaced by 


(A.3.18.5) E 1] bs El 
i Bea b ац. 


(A.3.13.3) and (A.3.13.5) taken together give us (A.3.13.1) and (A.3.13.2). Now for the near uniqueness in the case of distinct roots under $D_c$, put

$D_{c^*}(p \times p) = \begin{bmatrix} D_c & 0 \\ 0 & 0 \end{bmatrix}$, $\quad U = \begin{bmatrix} A_1 & \tilde{A}_3 \\ A_2 & A_4 \end{bmatrix}$,

and write $M_1 = UD_{c^*}U'$ and $M_2 = UU'$. If now there is another matrix $V$ satisfying the same conditions, then arguing in the same manner as in (A.3.7) we have $V^{-1}UD_{c^*} = D_{c^*}V^{-1}U$, or $B_1D_{c^*} = D_{c^*}B_1$, where $B_1 = V^{-1}U = (b_{ij})$, say. This, as in (A.3.7), leads to the equation $b_{ij}c_j = c_ib_{ij}$, whence $b_{ij} = 0$ if $i \ne j$ and $c_i \ne c_j$. Note that here $c_i = 0$ for $i = r+1, \ldots, p$. This shows that the $B_1$ matrix is of the form

$B_1 = \begin{bmatrix} D & 0 \\ 0 & S \end{bmatrix}$ (say),

where $D$ is an $r \times r$ diagonal matrix and $S$ is a solid $(p-r) \times (p-r)$ matrix. Remembering that $B_1 = V^{-1}U$, we have

$\begin{bmatrix} D^2 & 0 \\ 0 & SS' \end{bmatrix} = B_1B_1' = V^{-1}UU'(V')^{-1} = V^{-1}VV'(V')^{-1} = I(p)$,

whence $D^2 = I(r)$ and $S$ is seen to be an orthogonal matrix. Using the structure of $U$ and $V$ and the relation $V^{-1}U = B_1$ we have

$\begin{bmatrix} U_1 & \tilde{U}_3 \\ U_2 & U_4 \end{bmatrix} = \begin{bmatrix} V_1 & \tilde{V}_3 \\ V_2 & V_4 \end{bmatrix} \begin{bmatrix} D & 0 \\ 0 & S \end{bmatrix} = \begin{bmatrix} V_1D & \tilde{V}_3S \\ V_2D & V_4S \end{bmatrix}$,

where $S$ is an $\perp$ matrix. But, since $\tilde{U}_3 = \tilde{V}_3S$ with $\tilde{U}_3$ and $\tilde{V}_3$ both triangular, $S$ must also be a triangular matrix. Hence $S$ is necessarily of the form $D_{\pm 1}(p-r)$. Thus $B_1$ is of the form $D_{\pm 1}(p)$, which proves the near uniqueness in the case of distinct non-zero roots. This completes the proof of (A.3.13). In this case the transformation is easily checked to be one-to-one if we adopt the convention that the first row of $A_1$ and the diagonal elements of $\tilde{A}_3$ be positive.
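
As a small numerical illustration of the root structure used in (A.3.13), the following sketch takes hypothetical $2 \times 2$ matrices (they are not from the text): $M_2$ is p.d. and $M_1 = vv'$ is p.s.d. of rank $r = 1$. The helper names `det2` and `pencil_roots` are assumptions made for this example. The equation $|M_1 - cM_2| = 0$ should then have one zero root and one positive root.

```python
import math

def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def pencil_roots(m1, m2):
    """Roots c of |M1 - c*M2| = 0 for symmetric 2 x 2 M1 (p.s.d.) and M2 (p.d.)."""
    # Expanding the determinant gives det(M2)*c^2 - b*c + det(M1) = 0.
    a = det2(m2)
    b = (m1[0][0] * m2[1][1] + m1[1][1] * m2[0][0]
         - m1[0][1] * m2[1][0] - m1[1][0] * m2[0][1])
    c0 = det2(m1)
    disc = math.sqrt(max(b * b - 4.0 * a * c0, 0.0))
    return sorted([(b - disc) / (2.0 * a), (b + disc) / (2.0 * a)])

# Hypothetical data: M2 is p.d.; M1 = v v' has rank r = 1 < p = 2.
v = [1.0, 2.0]
M1 = [[v[0] * v[0], v[0] * v[1]], [v[1] * v[0], v[1] * v[1]]]
M2 = [[2.0, 1.0], [1.0, 3.0]]

roots = pencil_roots(M1, M2)
print(roots)  # one zero root and one positive root
```

For this rank-one choice the positive root also equals $v'M_2^{-1}v$, which gives an independent check on the computation.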


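(A.3.14): If $X_1(p \times n_1)$ $(p > n_1)$ be of rank $n_1$, such that the last $n_1$ rows form a square matrix which is non-singular, and $X_2(p \times n_2)$ $(p \le n_2)$ be of rank $p$, then there exists a transformation

$X_1(p \times n_1) = \begin{bmatrix} U_1 \\ U_2 \end{bmatrix} D_{\sqrt{c}}(n_1 \times n_1)\, L_1(n_1 \times n_1)$,

$X_2(p \times n_2) = \begin{bmatrix} U_1 & \tilde{U}_3 \\ U_2 & U_4 \end{bmatrix} L_2(p \times n_2)$,

where $U_1$ is $(p-n_1) \times n_1$, $U_2$ is $n_1 \times n_1$, $\tilde{U}_3$ is $(p-n_1) \times (p-n_1)$ and $U_4$ is $n_1 \times (p-n_1)$, such that $L_1$ is $\perp$ and $L_2L_2' = I(p)$, where the c's stand for the non-zero roots of the equation in c: $|X_1X_1' - cX_2X_2'| = 0$, and $U = \begin{bmatrix} U_1 & \tilde{U}_3 \\ U_2 & U_4 \end{bmatrix}$ is non-singular. Also if all the non-zero roots c are distinct, $U$ is unique except for a post-factor $D_{\pm 1}(p)$. Notice that, by (A.1.9), all the c's are anyway non-negative and $n_1$ of them are positive, the rest being zero.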


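Proof: By (A.3.13) there exists a (not necessarily unique) $U = \begin{bmatrix} U_1 & \tilde{U}_3 \\ U_2 & U_4 \end{bmatrix}$ such that

$X_1X_1' = \begin{bmatrix} U_1 \\ U_2 \end{bmatrix} D_c(n_1 \times n_1)\, [U_1' \;\vdots\; U_2']$ and $X_2X_2' = \begin{bmatrix} U_1 & \tilde{U}_3 \\ U_2 & U_4 \end{bmatrix} \begin{bmatrix} U_1' & U_2' \\ \tilde{U}_3' & U_4' \end{bmatrix}$.

Now define an $L_1(n_1 \times n_1)$ and $L_2(p \times n_2)$ by

$X_1(p \times n_1) = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} = \begin{bmatrix} U_1 \\ U_2 \end{bmatrix} D_{\sqrt{c}}(n_1 \times n_1)\, L_1(n_1 \times n_1)$,

$X_2(p \times n_2) = \begin{bmatrix} U_1 & \tilde{U}_3 \\ U_2 & U_4 \end{bmatrix} L_2(p \times n_2)$,

where $Y_1$ has $p - n_1$ rows and $Y_2$ has $n_1$ rows, and notice that $L_2$ is given uniquely by $L_2 = U^{-1}X_2$, in terms of $X_2$ and $U$, which itself may not be unique, and similarly $L_1$ is given uniquely (in the same sense) by $L_1 = D_{1/\sqrt{c}}\, U_2^{-1} Y_2$. Next we check that

$L_2L_2' = U^{-1}X_2X_2'(U^{-1})' = U^{-1}UU'(U')^{-1} = I(p)$ and

$L_1L_1' = D_{1/\sqrt{c}}\, U_2^{-1} Y_2Y_2' (U_2')^{-1} D_{1/\sqrt{c}} = D_{1/\sqrt{c}}\, U_2^{-1} U_2 D_c U_2' (U_2')^{-1} D_{1/\sqrt{c}} = I(n_1)$.

We observe also that if the non-zero roots are distinct, then, by (A.3.13), $U$ is unique except for a post-factor $D_{\pm 1}(p)$ and thus $L_1$ and $L_2$, which hang on $U$, are also indeterminate to the same extent. This completes the proof of (A.3.14). As in the case of (A.3.13), also here, for distinct roots the transformation can be made one-to-one by adopting the same convention as at the end of (A.3.13).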

(A.3.15): If $X_1(n_1 \times p)$, $X_2(p \times n_2)$ $(n_1 \le p \le n_2)$ are of ranks $n_1$ and $p$ respectively, then there exists a transformation: $X_1(n_1 \times p) = L(n_1 \times n_1)\, D_{\sqrt{c}}(n_1 \times n_1)\, L_1(n_1 \times p)\, \tilde{T}'(p \times p)$ and $X_2(p \times n_2) = \tilde{T}(p \times p)\, L_2(p \times n_2)$, where $L$ is $\perp$, $L_1L_1' = I(n_1)$, $L_2L_2' = I(p)$ and the c's are the $n_1$ characteristic roots (all positive) of $X_1(X_2X_2')^{-1}X_1'$. For distinct roots the transformation can be made one-to-one by letting $\tilde{T}$ have a positive diagonal.

Proof: Using (A.3.11), put $X_2(p \times n_2) = \tilde{T}(p \times p)\, L_2(p \times n_2)$, subject to $L_2L_2' = I(p)$, and now using (A.3.6) put $X_1(n_1 \times p)\,(\tilde{T}')^{-1}(p \times p) = L(n_1 \times n_1)\, D_{\sqrt{c}}(n_1 \times n_1)\, L_1(n_1 \times p)$, where $L$ is $\perp$, $L_1L_1' = I(n_1)$ and the c's are the roots of $X_1(\tilde{T}')^{-1}\tilde{T}^{-1}X_1'$, i.e., of $X_1(\tilde{T}\tilde{T}')^{-1}X_1'$, i.e., of $X_1(X_2X_2')^{-1}X_1'$. Post-multiplying both sides by $\tilde{T}'$ we have $X_1 = LD_{\sqrt{c}}L_1\tilde{T}'$, and for $X_2$ we already have $X_2 = \tilde{T}L_2$. Near uniqueness, in the case of distinct roots, follows along the same lines as in (A.3.11) and (A.3.14). Check, by using (A.1.18), that these c's of (A.3.15) are the same as the non-zero roots of the equation in c (considered in (A.3.14)): $|X_1'X_1 - cX_2X_2'| = 0$.


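(A.3.16): If $M = \begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix}$, with $M_{11}$ being $p \times p$ and $M_{22}$ being $q \times q$ $(p \le q)$, is symmetric p.d. (note that, in this situation, $M_{11}$ and $M_{22}$ are both necessarily symmetric p.d.) and if $M_{12}$ is of rank $s\ (\le p \le q)$ and if $D_c(s \times s)$ is the diagonal matrix based on the $s$ non-zero roots of the $p$-th degree equation in c: $|cM_{11} - M_{12}M_{22}^{-1}M_{12}'| = 0$, then there exist non-singular $A(p \times p)$ and $B(q \times q)$ which, without any loss of generality in the sense of (A.3.13), we can take to be of the structure

$A = \begin{bmatrix} A_1 & \tilde{A}_3 \\ A_2 & A_4 \end{bmatrix}$ (rows $p-s$, $s$; columns $s$, $p-s$) and $B = \begin{bmatrix} B_1 & \tilde{B}_3 \\ B_2 & B_4 \end{bmatrix}$ (rows $q-s$, $s$; columns $s$, $q-s$)

($\tilde{A}_3$ and $\tilde{B}_3$ being non-singular), such that $M_{11}(p \times p) = A(p \times p)\,A'(p \times p)$, $M_{22}(q \times q) = B(q \times q)\,B'(q \times q)$ and

$M_{12}(p \times q) = A(p \times p) \begin{bmatrix} D_{\sqrt{c}}(s \times s) & 0(s \times \overline{q-s}) \\ 0(\overline{p-s} \times s) & 0(\overline{p-s} \times \overline{q-s}) \end{bmatrix} B'(q \times q) = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} D_{\sqrt{c}}\, [B_1' \;\vdots\; B_2']$;

also, if the c's are distinct, $A$ is unique except for a post-factor $D_{\pm 1}(p)$ and, for a given choice of $A$, $B$ is unique except for a post-factor $D_{\pm 1}(q-s)$ to go with $\tilde{B}_3$.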
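Proof: Since $M_{11}$ is symmetric p.d. and $M_{12}M_{22}^{-1}M_{12}'$ is symmetric and at least p.s.d. of rank $s\ (\le p)$, there exists, by and in the sense of (A.3.13), a transformation $M_{11} = AA'$ and

$M_{12}M_{22}^{-1}M_{12}' = A \begin{bmatrix} D_c & 0 \\ 0 & 0 \end{bmatrix} A'$, where $A = \begin{bmatrix} A_1 & \tilde{A}_3 \\ A_2 & A_4 \end{bmatrix}$

is non-singular, $\tilde{A}_3$ is non-singular and the c's stand for the $s$ non-zero roots of the equation in c, the rest, $p-s$ in number, being zero. Next, since $M_{22}(q \times q)$ is symmetric p.d. it follows from (A.3.3) that there is an orthogonal $E(q \times q)$ such that $M_{22} = ED_eE'$, where $e = (e_1, \ldots, e_q)$ denotes the $q$ characteristic roots (all positive) of the p.d. matrix $M_{22}$. Substituting this in $M_{12}M_{22}^{-1}M_{12}'$ we have

(A.3.16.1) $M_{12}ED_{1/e}E'M_{12}' = A \begin{bmatrix} D_c & 0 \\ 0 & 0 \end{bmatrix} A'$,

or (since $E$ is $\perp$ and $A$ is non-singular),

(A.3.16.2) $A^{-1}M_{12}ED_{1/e}E'M_{12}'(A')^{-1} = \begin{bmatrix} D_c & 0 \\ 0 & 0 \end{bmatrix}$ (rows and columns $s$, $p-s$).

We now define a $G_1(s \times q)$ by

(A.3.16.3) $D_{\sqrt{c}}(s \times s)\, G_1(s \times q) =$ the submatrix formed by the first $s$ rows of $(A^{-1}M_{12}ED_{1/\sqrt{e}})$.

It is easy to check that, given the other elements, (A.3.16.3) defines $G_1$ uniquely and also that $G_1(s \times q)\, G_1'(q \times s) = I(s)$. It is well known that if $G_1(s \times q)$ $(s \le q)$ satisfies $G_1G_1' = I(s)$, then we can adjoin a $G_2(\overline{q-s} \times q)$ to $G_1$ such that $\begin{bmatrix} G_1 \\ G_2 \end{bmatrix}$ is an $\perp$ matrix. With this adjunction we can now write

(A.3.16.4) $(A^{-1}M_{12}ED_{1/\sqrt{e}}) = \begin{bmatrix} D_{\sqrt{c}} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} G_1 \\ G_2 \end{bmatrix}$ (block rows $s$, $p-s$; block columns $s$, $q-s$),

or

(A.3.16.5) $(A^{-1}M_{12}ED_{1/\sqrt{e}})\, [G_1' \;\vdots\; G_2'] = \begin{bmatrix} D_{\sqrt{c}} & 0 \\ 0 & 0 \end{bmatrix}$.

Next put $(ED_{1/\sqrt{e}})[G_1' \;\vdots\; G_2'] = (F')^{-1}$ (say), so that

(A.3.16.6) $F'(q \times q) = \begin{bmatrix} G_1 \\ G_2 \end{bmatrix} D_{\sqrt{e}}\, E'$

(remembering that $E$ is $\perp$). Notice that, given the submatrices of the $M$ matrix, we can find a non-singular $A$, an $\perp$ $E$ and an $\perp$ $\begin{bmatrix} G_1 \\ G_2 \end{bmatrix}$ (none being necessarily unique) and thus a (non-singular but not necessarily unique) $F$ given by (A.3.16.6). Using now (A.3.16.5), (A.3.16.6) and the definition of $A$ (in the beginning of the proof), we check that we have non-singular $A$ and $F$ satisfying

(A.3.16.7) $M_{11}(p \times p) = A(p \times p)\, A'(p \times p)$, $\quad M_{12}(p \times q) = A(p \times p) \begin{bmatrix} D_{\sqrt{c}} & 0 \\ 0 & 0 \end{bmatrix} F'(q \times q)$ and $M_{22}(q \times q) = F(q \times q)\, F'(q \times q)$.

We next partition $F$ into $\begin{bmatrix} F_1 & F_3 \\ F_2 & F_4 \end{bmatrix}$ (rows $q-s$, $s$; columns $s$, $q-s$), assume in the sense of (A.3.13) that $F_3$ is non-singular (as we obviously can), note that $F_3$ and $F_4$ do not occur in the factorization of $M_{12}$, and put $F_1 = B_1$, $F_2 = B_2$, $F_3(\overline{q-s} \times \overline{q-s}) = \tilde{B}_3(\overline{q-s} \times \overline{q-s})\, L(\overline{q-s} \times \overline{q-s})$ (where $L$ is $\perp$) and $F_4(s \times \overline{q-s}) = B_4(s \times \overline{q-s})\, L(\overline{q-s} \times \overline{q-s})$. As in (A.3.13), remembering the structure of $A$, we now rewrite (A.3.16.7) as

(A.3.16.8) $M_{11}(p \times p) = \begin{bmatrix} A_1 & \tilde{A}_3 \\ A_2 & A_4 \end{bmatrix} \begin{bmatrix} A_1' & A_2' \\ \tilde{A}_3' & A_4' \end{bmatrix}$,

$M_{12}(p \times q) = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} D_{\sqrt{c}}(s \times s)\, [B_1' \;\vdots\; B_2']$

and $M_{22}(q \times q) = \begin{bmatrix} B_1 & \tilde{B}_3 \\ B_2 & B_4 \end{bmatrix} \begin{bmatrix} B_1' & B_2' \\ \tilde{B}_3' & B_4' \end{bmatrix}$,

which establishes the existence of the transformation (A.3.16).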


To prove the near uniqueness of $A$ and $B$ where the c's are distinct, we first recall the definition of $A$ and observe, as in the proof of (A.3.13), that $A$ is unique except for a post-factor $D_{\pm 1}(p)$. The second equation of (A.3.16.8) shows that at this stage $B_1$ and $B_2$ are unique except for the post-factor that goes with $A$. Now consider the third equation of (A.3.16.8), partition $M_{22}$ into four submatrices $M_{22}^{(ij)}$ $(i, j = 1, 2)$, and rewrite the equation as

(A.3.16.9) $\begin{bmatrix} M_{22}^{(11)} & M_{22}^{(12)} \\ M_{22}^{(21)} & M_{22}^{(22)} \end{bmatrix} = \begin{bmatrix} B_1B_1' + \tilde{B}_3\tilde{B}_3' & B_1B_2' + \tilde{B}_3B_4' \\ B_2B_1' + B_4\tilde{B}_3' & B_2B_2' + B_4B_4' \end{bmatrix}$ (rows and columns $q-s$, $s$),

whence, from the relation $B_1B_1' + \tilde{B}_3\tilde{B}_3' = M_{22}^{(11)}$, remembering that $B_1$ is already known and using (A.3.10), we see that $\tilde{B}_3$ is uniquely determined except for a post-factor $D_{\pm 1}(q-s)$. The equation $B_2B_1' + B_4\tilde{B}_3' = M_{22}^{(21)}$ now uniquely defines $B_4$ except for the post-factors that go with the other $B$'s. This completes the proof of the near uniqueness in the case of distinct c's. If $s = p$, i.e., if $M_{12}$ is of rank $p$, then all roots become positive, i.e., $D_c$ becomes $p \times p$. $A$ becomes a solid matrix while $B$ retains its own structure with $q-s$ being replaced by $q-p$. If $q = p$, then $B$ itself becomes a solid matrix. As before, for the case of distinct roots, the transformation is checked to be one-to-one by adopting the convention, say, that the first row of $A_1$ and the diagonal elements of $\tilde{A}_3$ and $\tilde{B}_3$ are to be positive.

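(A.3.17): If $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$, with $X_1$ being $p \times n$ and $X_2$ being $q \times n$ $(p \le q,\ p+q \le n)$, is of rank $(p+q)$ and $X_1X_2'$ is also of rank $p$, then there exists a transformation

$X_2(q \times n) = \tilde{T}(q \times q)\, L_2(q \times n)$

and $X_1(p \times n) = U(p \times p)\, [D_{\sqrt{e}}(p \times p)\, M_1(p \times \overline{n-q}) \;\vdots\; M_2(p \times q)] \begin{bmatrix} L_1 \\ L_2 \end{bmatrix}$,

where $L_1$ is $(n-q) \times n$, $\tilde{T}$ and $U$ are non-singular, $M_1M_1' = M_2M_2' = I(p)$, $L_2L_2' = I(q)$, and $L_1$ is a completion of $L_2$ (see (A.1.7)) such that $\begin{bmatrix} L_1 \\ L_2 \end{bmatrix}$ is $\perp$, and $e_i = (1-c_i)/c_i$ or $c_i = 1/(1+e_i)$ $(i = 1, \ldots, p)$, and the c's are the roots of the equation in c:

$|c(X_1X_1') - (X_1X_2')(X_2X_2')^{-1}(X_2X_1')| = 0$.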


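Proof: Using (A.3.11) put $X_2(q \times n) = \tilde{T}(q \times q)\, L_2(q \times n)$, where $\tilde{T}$ is non-singular and $L_2L_2' = I(q)$. Complete $L_2$ by an $L_1$ such that $\begin{bmatrix} L_1 \\ L_2 \end{bmatrix}$ is $\perp$. Now using (A.3.8), put

$X_1(p \times n)\, [L_1'(n \times \overline{n-q}) \;\vdots\; L_2'(n \times q)] = U(p \times p)\, [D_{\sqrt{e}}(p \times p)\, M_1(p \times \overline{n-q}) \;\vdots\; M_2(p \times q)]$,

where $U$ is non-singular, $M_1M_1' = M_2M_2' = I(p)$ and the e's are the roots of the equation in e: $|(X_1L_1'L_1X_1') - e(X_1L_2'L_2X_1')| = 0$. Multiplying both sides of the $X_1$-equation by $\begin{bmatrix} L_1 \\ L_2 \end{bmatrix}$ and taking into account the $X_2$-equation we have the transformation (A.3.17), except for the required interpretation of e, which is as follows. $L_2 = \tilde{T}^{-1}X_2$, so that $L_2'L_2 = X_2'(\tilde{T}\tilde{T}')^{-1}X_2$. Also $L_1'L_1 = I - L_2'L_2 = I - X_2'(\tilde{T}\tilde{T}')^{-1}X_2$. Hence the equation in e becomes

$|X_1[I - X_2'(\tilde{T}\tilde{T}')^{-1}X_2]X_1' - eX_1X_2'(\tilde{T}\tilde{T}')^{-1}X_2X_1'| = 0$, or

$\left| \dfrac{1}{1+e}\, X_1X_1' - X_1X_2'(X_2X_2')^{-1}X_2X_1' \right| = 0$ (since $X_2X_2' = \tilde{T}\tilde{T}'$),

which completes the proof of (A.3.17).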

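(A.3.18): If $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$, with $X_1$ being $p \times n$ and $X_2$ being $q \times n$ $(p \le q,\ p+q \le n;\ \text{rank} = p+q)$, is such that $X_1X_2'$ is of rank $s \le p$ (in which case it is easy to check that $X_1X_1'$ and $X_2X_2'$ are each symmetric p.d. and $X_1X_2'(X_2X_2')^{-1}X_2X_1'$ is symmetric and at least p.s.d. of rank $s$, so that $s$ roots of the $p$-th degree equation in c: $|c(X_1X_1') - (X_1X_2')(X_2X_2')^{-1}(X_2X_1')| = 0$ are positive, the rest being zero), then there exists a transformation

(A.3.18.1) $X_1(p \times n) = \begin{bmatrix} A_1 & \tilde{A}_3 \\ A_2 & A_4 \end{bmatrix} \begin{bmatrix} D_{\sqrt{c}} & 0 \\ 0 & I(p-s) \end{bmatrix} \begin{bmatrix} L_3 \\ L_2 \end{bmatrix} + \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} D_{\sqrt{1-c}}(s \times s)\, L_1(s \times n)$,

(A.3.18.2) $X_2(q \times n) = \begin{bmatrix} B_1 & \tilde{B}_3 \\ B_2 & B_4 \end{bmatrix} \begin{bmatrix} L_3 \\ L_4 \end{bmatrix}$,

where $L_1$ and $L_3$ are $s \times n$, $L_2$ is $(p-s) \times n$ and $L_4$ is $(q-s) \times n$, where the $D_c(s \times s)$ is based on the $s$ positive roots of the equation already mentioned, and where the $A$ and $B$ are non-singular matrices defined, after (A.3.16), by

(A.3.18.3) $X_1X_1' = \begin{bmatrix} A_1 & \tilde{A}_3 \\ A_2 & A_4 \end{bmatrix} \begin{bmatrix} A_1' & A_2' \\ \tilde{A}_3' & A_4' \end{bmatrix}$, $\quad X_2X_2' = \begin{bmatrix} B_1 & \tilde{B}_3 \\ B_2 & B_4 \end{bmatrix} \begin{bmatrix} B_1' & B_2' \\ \tilde{B}_3' & B_4' \end{bmatrix}$ and $X_1X_2' = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} D_{\sqrt{c}}\, [B_1' \;\vdots\; B_2']$,

and where the $L$ matrices are subject to

$\begin{bmatrix} L_1 \\ L_2 \\ L_3 \\ L_4 \end{bmatrix} [L_1' \;\vdots\; L_2' \;\vdots\; L_3' \;\vdots\; L_4'] = I(p+q)$.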
n 
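Proof:

(A.3.18.4) Put $X_1(p \times n) = \begin{bmatrix} A_1 & \tilde{A}_3 \\ A_2 & A_4 \end{bmatrix} \begin{bmatrix} M \\ L_2 \end{bmatrix}$, where $M$ is $s \times n$ and $L_2$ is $(p-s) \times n$, and

(A.3.18.5) $X_2(q \times n) = \begin{bmatrix} B_1 & \tilde{B}_3 \\ B_2 & B_4 \end{bmatrix} \begin{bmatrix} L_3 \\ L_4 \end{bmatrix}$, where $L_3$ is $s \times n$ and $L_4$ is $(q-s) \times n$.

Now check that, since $A$ and $B$ are non-singular, the above equations define $M$, $L_2$, $L_3$, $L_4$ uniquely except for the indeterminacy in $A$ and $B$. Now, using the first two equations of (A.3.18.3), it is easy to check that

(A.3.18.6) $\begin{bmatrix} M \\ L_2 \end{bmatrix} [M' \;\vdots\; L_2'] = I(p)$ and $\begin{bmatrix} L_3 \\ L_4 \end{bmatrix} [L_3' \;\vdots\; L_4'] = I(q)$.

Substituting for $X_1$ and $X_2$ (in terms of $A$, $B$ and $M$, $L_2$, $L_3$ and $L_4$) in the third equation of (A.3.18.3) we have

(A.3.18.7) $A \begin{bmatrix} M \\ L_2 \end{bmatrix} [L_3' \;\vdots\; L_4']\, B' = A \begin{bmatrix} D_{\sqrt{c}} & 0 \\ 0 & 0 \end{bmatrix} B'$,

whence it follows that

(A.3.18.8) $\begin{bmatrix} M \\ L_2 \end{bmatrix} [L_3' \;\vdots\; L_4'] = \begin{bmatrix} D_{\sqrt{c}} & 0 \\ 0 & 0 \end{bmatrix}$ (block rows $s$, $p-s$; block columns $s$, $q-s$).

Let us now put

(A.3.18.9) $M(s \times n) = D_{\sqrt{c}}(s \times s)\, L_3(s \times n) + M_1(s \times n)$,

which uniquely defines $M_1$ in terms of $M$, $L_3$ and the c's. Now substituting in the equations (A.3.18.6) and (A.3.18.8) for $M$ the right hand side of (A.3.18.9), we have

(A.3.18.10) $M_1[L_2' \;\vdots\; L_3' \;\vdots\; L_4'] = [0, 0, 0]$ and

(A.3.18.11) $I(s) = MM' = D_{\sqrt{c}}L_3L_3'D_{\sqrt{c}} + M_1M_1' = D_c + M_1M_1'$.

It follows from (A.3.18.11) that

(A.3.18.12) $M_1M_1' = I(s) - D_c = D_{1-c}$,

so that if we put

(A.3.18.13) $M_1(s \times n) = D_{\sqrt{1-c}}(s \times s)\, L_1(s \times n)$,

we shall have, from (A.3.18.12) and (A.3.18.10),

(A.3.18.14) $L_1L_1' = I(s)$ and $L_1[L_2', L_3', L_4'] = [0, 0, 0]$.

Substituting from (A.3.18.13) for $M_1$ in (A.3.18.9) we have

(A.3.18.15) $M(s \times n) = D_{\sqrt{c}}(s \times s)\, L_3(s \times n) + D_{\sqrt{1-c}}(s \times s)\, L_1(s \times n)$,

where $L_1$ satisfies (A.3.18.14).

Now substituting for $M$ from (A.3.18.15) in (A.3.18.4) and using (A.3.18.6), (A.3.18.8) and (A.3.18.14) we have

(A.3.18.16) $X_1 = \begin{bmatrix} A_1 & \tilde{A}_3 \\ A_2 & A_4 \end{bmatrix} \begin{bmatrix} D_{\sqrt{c}}L_3 + D_{\sqrt{1-c}}L_1 \\ L_2 \end{bmatrix} = \begin{bmatrix} A_1 & \tilde{A}_3 \\ A_2 & A_4 \end{bmatrix} \begin{bmatrix} D_{\sqrt{c}} & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} L_3 \\ L_2 \end{bmatrix} + \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} D_{\sqrt{1-c}}\, L_1$ and

(A.3.18.17) $X_2 = \begin{bmatrix} B_1 & \tilde{B}_3 \\ B_2 & B_4 \end{bmatrix} \begin{bmatrix} L_3 \\ L_4 \end{bmatrix}$,

where the $L$'s satisfy

(A.3.18.18) $\begin{bmatrix} L_1 \\ L_2 \\ L_3 \\ L_4 \end{bmatrix} [L_1' \;\vdots\; L_2' \;\vdots\; L_3' \;\vdots\; L_4'] = I(p+q)$.

This proves (A.3.18). If $s = p$ (which is the case that will be actually considered in this monograph), $L_2$ will be absent, and $q-s = q-p$, and we shall have

(A.3.18.19) $X_1(p \times n) = A(p \times p)\, [D_{\sqrt{1-c}} \;\vdots\; D_{\sqrt{c}}] \begin{bmatrix} L_1 \\ L_3 \end{bmatrix}$ and $X_2(q \times n) = \begin{bmatrix} B_1 & \tilde{B}_3 \\ B_2 & B_4 \end{bmatrix} \begin{bmatrix} L_3 \\ L_4 \end{bmatrix}$,

where $L_1$ and $L_3$ are $p \times n$, $L_4$ is $(q-p) \times n$, and the $L$'s satisfy

(A.3.18.20) $\begin{bmatrix} L_1 \\ L_3 \\ L_4 \end{bmatrix} [L_1' \;\vdots\; L_3' \;\vdots\; L_4'] = I(p+q)$.

As to the indeterminacy on the right hand side of (A.3.18.1) and (A.3.18.2) (for the case $s < p$) and of (A.3.18.19) (for the case $s = p$), it is easy to check that in either case, if the non-zero roots are all distinct, there is near uniqueness in the sense of (A.3.16), the only indeterminacy arising out of a post-factor $D_{\pm 1}(p)$ going with the total $A$ matrix and a post-factor $D_{\pm 1}(q-s)$ going with $\tilde{B}_3$. In this case the transformation can be made one-to-one by adopting the same convention as, say, at the end of (A.3.16).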


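(A.3.19): For $X = \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix}$, with $X_1$, $X_2$, $X_3$ having $p$, $q$, $r$ rows respectively and $n$ columns $(p \le q,\ p+q+r \le n)$, of rank $p+q+r$, (i) there exists a transformation: $X_3(r \times n) = \tilde{T}(r \times r)\, L_3(r \times n)$ subject to $L_3L_3' = I(r)$ and

$X_1(p \times n) = [Z_{11} \;\vdots\; Z_{12}] \begin{bmatrix} L \\ L_3 \end{bmatrix}$ and $X_2(q \times n) = [Z_{21} \;\vdots\; Z_{22}] \begin{bmatrix} L \\ L_3 \end{bmatrix}$,

where $Z_{11}$ and $Z_{21}$ have $n-r$ columns, $Z_{12}$ and $Z_{22}$ have $r$ columns, and $L(\overline{n-r} \times n)$ is just a completion of $L_3$ so that $\begin{bmatrix} L \\ L_3 \end{bmatrix}$ is $\perp$. (ii) Putting $M = XX'$ (observe that, by (A.1.10), $M$ will be symmetric p.d.), the roots of the equation in c, namely (A.2.4.1) or (A.2.4.2), are the same as the characteristic roots of $(Z_{11}Z_{11}')^{-1}(Z_{11}Z_{21}')(Z_{21}Z_{21}')^{-1}(Z_{21}Z_{11}')$.

Proof: The proof of (i) is obvious from the preceding sections. For (ii) we observe that $L_3 = \tilde{T}^{-1}X_3$, so that $L_3' = X_3'(\tilde{T}')^{-1}$, whence $L_3'L_3 = X_3'(\tilde{T}\tilde{T}')^{-1}X_3 = X_3'(X_3X_3')^{-1}X_3$. Therefore $L'L = I(n) - L_3'L_3 = I(n) - X_3'(X_3X_3')^{-1}X_3$, and thus

$Z_{11}Z_{11}' = X_1L'LX_1' = X_1X_1' - X_1X_3'(X_3X_3')^{-1}X_3X_1' = M_{11} - M_{13}M_{33}^{-1}M_{13}'$,

$Z_{21}Z_{21}' = X_2L'LX_2' = X_2X_2' - X_2X_3'(X_3X_3')^{-1}X_3X_2' = M_{22} - M_{23}M_{33}^{-1}M_{23}'$,

and $Z_{11}Z_{21}' = X_1L'LX_2' = X_1X_2' - X_1X_3'(X_3X_3')^{-1}X_3X_2' = M_{12} - M_{13}M_{33}^{-1}M_{23}'$.

This completes the proof of (ii).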
(A.3.20): For an M of the structure (A.2.4) there exists the transformation 


Pike ОЁ E IPIE p At 0|0 
M-—q 0 A, | As |? 7 Ne p 0 4510 
pU quad. 0 q—p L 4; AL As 
p q T — ——. 
. 0 I T 


? PEED К 


where A, 0 A, іза non-singular matrix and c’s are the roots of the equation 
0. Ay А, 
Ome cede 


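in c, (A.2.4.1) or (A.2.4.2).

Proof: We can write

$\begin{bmatrix} M_{11} & M_{12} & M_{13} \\ M_{12}' & M_{22} & M_{23} \\ M_{13}' & M_{23}' & M_{33} \end{bmatrix} = \begin{bmatrix} M_{11} - M_{13}M_{33}^{-1}M_{13}' & M_{12} - M_{13}M_{33}^{-1}M_{23}' & 0 \\ M_{12}' - M_{23}M_{33}^{-1}M_{13}' & M_{22} - M_{23}M_{33}^{-1}M_{23}' & 0 \\ 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} M_{13}M_{33}^{-1}M_{13}' & M_{13}M_{33}^{-1}M_{23}' & M_{13} \\ M_{23}M_{33}^{-1}M_{13}' & M_{23}M_{33}^{-1}M_{23}' & M_{23} \\ M_{13}' & M_{23}' & M_{33} \end{bmatrix}$.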
E Using(A.3.9) and (A.3.16) wecannow put M5,—4,4;, M; МИ МАА, 
3 М.—М.„М Mog = АА, and М,— М.М Мә = А(рхр) [Ю 0] As(qx q). 


Diug p 
If we next put M,(pxr) = Ajpxr)AQrxr) and М.х) = А(фхт)А (хт), 
we observe that A, and A, are determinate. We check furthermore that now 
МММ; = A545, МММ = АА; and М.И: Ms; = АзА,, so that alto- 
gether we have 


= ААА, АО OVAL taal “АА E 
ve g , П 
M = |а, 414-4445 АА-АА; A,At 
0 
4,4; Дд AA... | 
Ау NOTE А T ID a| A; 0 
= 0 
Op Aga Ay pe E 0/7 Ag | j 
0 NUR ERE A, AL AL- 


which proves (A.3.20). 

(A.3.21) The passage from $L$ matrices to $L_I$ variables. Consider the transformations (A.3.6), (A.3.8), (A.3.11), (A.3.14), (A.3.15), (A.3.17), (A.3.18) and (A.3.19) and notice that everywhere we have, on the right hand side, a post-factor of the form $L(p \times n)$ $(p \le n)$ subject to the constraint $LL' = I(p)$. Check that the actual number of independent constraints is just $p(p+1)/2$. Suppose now that instead of transforming to $L$ subject to $LL' = I(p)$, we take a slightly different set of variates in the following way. Putting

$L(p \times n) = \begin{bmatrix} l_{11} & \cdots & l_{1n} \\ \vdots & & \vdots \\ l_{p1} & \cdots & l_{pn} \end{bmatrix} = \begin{bmatrix} l_1 \\ \vdots \\ l_p \end{bmatrix}$ (say),

we notice that $LL' = I(p) \Leftrightarrow l_il_j' = \delta_{ij}$ $(i, j = 1, 2, \ldots, p)$, the Kronecker delta, so that, by virtue of the $p(p+1)/2$ constraints, $L$ really consists of $pn - p(p+1)/2$ independent elements, although the $(p \times n)$ matrix itself is naturally one of $pn$ elements. From $L$ let us choose an independent set, say, $(l_{11}, l_{12}, \ldots, l_{1,n-1})$, $(l_{21}, l_{22}, \ldots, l_{2,n-2})$, $\ldots$, $(l_{p1}, l_{p2}, \ldots, l_{p,n-p})$ and let us call this set $L_I$. Throughout this monograph $L_I$ will stand uniformly for this set of variates.
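
The bookkeeping in (A.3.21) can be checked mechanically. The sketch below enumerates the index pairs $(i, j)$ of the independent set $L_I$ and of the remaining (dependent) elements; the function names are hypothetical and chosen only for this illustration.

```python
def independent_index_set(p, n):
    """Index pairs (i, j) of the set L_I: row i keeps columns 1, ..., n - i."""
    return [(i, j) for i in range(1, p + 1) for j in range(1, n - i + 1)]

def dependent_index_set(p, n):
    """Index pairs (i, j) of the remaining elements: row i has columns n - i + 1, ..., n."""
    return [(i, j) for i in range(1, p + 1) for j in range(n - i + 1, n + 1)]

p, n = 3, 5
L_I = independent_index_set(p, n)
L_D = dependent_index_set(p, n)

# The counts should agree with pn - p(p+1)/2 and p(p+1)/2 respectively.
print(len(L_I), p * n - p * (p + 1) // 2)
print(len(L_D), p * (p + 1) // 2)
```

The number of dependent elements equals the number of independent constraints, which is what makes the elimination in (A.3.22) possible.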

(A.3.22): It will now be shown that if no elements of $L$ are 0, then the correspondence between $L_I$ and $L$ is one-to-$2^p$.

Proof: Having regard to the constraint $LL' = I(p)$, under our set-up, we are going to treat $l_{ij}$ $(i = 1, 2, \ldots, p;\ j = 1, 2, \ldots, n-i)$ $(= L_I$ say) as the (so-called) independent variates and $l_{ij}$ $(i = 1, 2, \ldots, p;\ j = n-i+1, \ldots, n)$ $(= L_D$ say) as the (so-called) dependent variates. This notation will be uniformly followed. We have now the following equations in the dependent variates (in terms of the independent):

For the first row of the $L$ matrix

(A.3.22.1) $l_{1n}^2 = 1 - \sum_{j=1}^{n-1} l_{1j}^2$.

For the second row of the $L$ matrix

(A.3.22.2) $l_{2,n-1}l_{1,n-1} + l_{2n}l_{1n} = -\sum_{j=1}^{n-2} l_{2j}l_{1j}$; $\quad l_{2,n-1}^2 + l_{2n}^2 = 1 - \sum_{j=1}^{n-2} l_{2j}^2$.

And in general for the $i$-th row of the $L$ matrix (with $i = 1, 2, \ldots, p$)

(A.3.22.3) $\sum_{j=n-i+1}^{n} l_{ij}l_{i'j} = -\sum_{j=1}^{n-i} l_{ij}l_{i'j}$ for $i' = 1, 2, \ldots, i-1$; $\quad \sum_{j=n-i+1}^{n} l_{ij}^2 = 1 - \sum_{j=1}^{n-i} l_{ij}^2$.

It is easy to see that, for the first row of $L$, the equation (A.3.22.1) gives (in this case) two real and distinct values of $l_{1n}$ in terms of $(l_{11}, \ldots, l_{1,n-1})$. Next, for the second row of $L$, the equations (A.3.22.2) give (in this case) two real and distinct pairs of values for $(l_{2,n-1}, l_{2n})$ in terms of the first row (now supposed to be given), and so on. In general, for the $i$-th row of $L$, the equations (A.3.22.3) give (in this case) two real and distinct sets of values for $(l_{i,n-i+1}, \ldots, l_{in})$ in terms of the $(i-1)$ previous rows (now supposed to be given). This proves (A.3.22).
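
To see the one-to-$2^p$ correspondence concretely, the following sketch treats the smallest non-trivial case $p = 2$, $n = 3$, where $L_I = (l_{11}, l_{12}, l_{21})$. It solves (A.3.22.1) and (A.3.22.2) in closed form and should return $2^2 = 4$ distinct completions. The numerical values and the function names are hypothetical, and the sketch assumes the quantities under the square roots are positive.

```python
import math

def completions(a, b, c):
    """All 2 x 3 matrices L with LL' = I(2) and l11 = a, l12 = b, l21 = c."""
    out = []
    s = math.sqrt(1.0 - a * a - b * b)  # |l13| from (A.3.22.1)
    for sign1 in (1.0, -1.0):
        l13 = sign1 * s
        # Second row (c, y, z): b*y + l13*z = -a*c and y^2 + z^2 = 1 - c^2,
        # i.e. the two equations of (A.3.22.2).
        norm2 = b * b + l13 * l13
        k = -a * c
        t = math.sqrt(((1.0 - c * c) - k * k / norm2) / norm2)
        for sign2 in (1.0, -1.0):
            y = k * b / norm2 - sign2 * t * l13
            z = k * l13 / norm2 + sign2 * t * b
            out.append([[a, b, l13], [c, y, z]])
    return out

def gram(L):
    """L L' for a matrix given as a list of rows."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in L] for r1 in L]

solutions = completions(0.5, 0.5, 0.3)
print(len(solutions))
for L in solutions:
    print(L, gram(L))
```

Each choice of sign for the first row is followed by two choices for the second row, which is the counting argument of the proof.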


APPENDIX 4 


Invariance of the Characteristic Roots under Certain Linear 
Transformations 


(A.4.1): If $X(p \times n)$ $(p \le n)$ is of rank $p$ (in which case, by (A.1.10), $XX'$ is symmetric p.d.), then the characteristic roots of $XX'$ are invariant under the transformation: $X(p \times n) = A(p \times p)\, Y(p \times n)\, B(n \times n)$, where $A$ and $B$ are any two $\perp$ matrices.

Proof: $c(XX') = c(AYBB'Y'A') = c(AYY'A')$ (since $B$ is $\perp$) $= c(YY'A'A)$ (using (A.1.18)) $= c(YY')$ (since $A$ is $\perp$), which completes the proof of (A.4.1).
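
A quick numerical check of (A.4.1) for $p = 2$, $n = 3$ is sketched below. The matrix $Y$ and the rotation angles are hypothetical, the orthogonal matrices are built from plane rotations, and the characteristic roots of the $2 \times 2$ matrix are obtained from its trace and determinant. The helper names are assumptions made for this example.

```python
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def plane_rotation(n, i, j, theta):
    """n x n orthogonal matrix rotating the (i, j) coordinate plane by theta."""
    r = [[1.0 if u == v else 0.0 for v in range(n)] for u in range(n)]
    r[i][i] = r[j][j] = math.cos(theta)
    r[i][j] = -math.sin(theta)
    r[j][i] = math.sin(theta)
    return r

def roots_2x2(m):
    """Characteristic roots of a 2 x 2 matrix with real roots, via trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return sorted([(tr - disc) / 2.0, (tr + disc) / 2.0])

Y = [[1.0, 2.0, 0.5], [0.0, 1.0, -1.0]]            # hypothetical data, rank 2
A = plane_rotation(2, 0, 1, 0.7)                     # orthogonal 2 x 2
B = matmul(plane_rotation(3, 0, 1, 1.1), plane_rotation(3, 1, 2, -0.4))  # orthogonal 3 x 3

X = matmul(matmul(A, Y), B)
roots_X = roots_2x2(matmul(X, transpose(X)))
roots_Y = roots_2x2(matmul(Y, transpose(Y)))
print(roots_X, roots_Y)
```

The two lists should agree up to rounding error.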


(A.4.2): If $X_1(p \times n_1)$, $X_2(p \times n_2)$ $(p \le n_1, n_2)$ are each of rank $p$ (in which case, by (A.1.10), $X_1X_1'$ and $X_2X_2'$ are both symmetric p.d.), then the characteristic roots of $(X_1X_1')(X_2X_2')^{-1}$ are invariant under the transformation: $X_1(p \times n_1) = A(p \times p)\, Y_1(p \times n_1)\, B_1(n_1 \times n_1)$ and $X_2(p \times n_2) = A(p \times p)\, Y_2(p \times n_2)\, B_2(n_2 \times n_2)$, where $A$ is any non-singular matrix and $B_1$ and $B_2$ any two $\perp$ matrices.

Proof: $c[(X_1X_1')(X_2X_2')^{-1}] = c[(AY_1B_1B_1'Y_1'A')(AY_2B_2B_2'Y_2'A')^{-1}]$
$= c[(AY_1Y_1'A')(AY_2Y_2'A')^{-1}]$ (since $B_1$ and $B_2$ are $\perp$)
$= c[A(Y_1Y_1')(Y_2Y_2')^{-1}A^{-1}] = c[(Y_1Y_1')(Y_2Y_2')^{-1}A^{-1}A]$ (using (A.1.18))
$= c[(Y_1Y_1')(Y_2Y_2')^{-1}]$, which completes the proof of (A.4.2).
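
The same kind of check can be made for (A.4.2), now with a non-singular but non-orthogonal $A$. The sketch below uses hypothetical matrices with $p = 2$, $n_1 = 2$, $n_2 = 3$; all names and numbers are assumptions made for illustration.

```python
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inv2(m):
    """Inverse of a non-singular 2 x 2 matrix."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def plane_rotation(n, i, j, theta):
    """n x n orthogonal matrix rotating the (i, j) coordinate plane by theta."""
    r = [[1.0 if u == v else 0.0 for v in range(n)] for u in range(n)]
    r[i][i] = r[j][j] = math.cos(theta)
    r[i][j] = -math.sin(theta)
    r[j][i] = math.sin(theta)
    return r

def roots_2x2(m):
    """Characteristic roots of a 2 x 2 matrix with real roots, via trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return sorted([(tr - disc) / 2.0, (tr + disc) / 2.0])

def roots_of_ratio(x1, x2):
    """Characteristic roots of (X1 X1')(X2 X2')^{-1} for p = 2."""
    return roots_2x2(matmul(matmul(x1, transpose(x1)), inv2(matmul(x2, transpose(x2)))))

Y1 = [[1.0, 2.0], [0.0, 1.0]]                 # hypothetical, rank 2
Y2 = [[1.0, 0.0, 1.0], [0.0, 2.0, 1.0]]       # hypothetical, rank 2
A = [[2.0, 1.0], [0.5, 3.0]]                  # non-singular, not orthogonal
B1 = plane_rotation(2, 0, 1, 0.3)
B2 = plane_rotation(3, 0, 2, 1.2)

X1 = matmul(matmul(A, Y1), B1)
X2 = matmul(matmul(A, Y2), B2)
roots_X = roots_of_ratio(X1, X2)
roots_Y = roots_of_ratio(Y1, Y2)
print(roots_X, roots_Y)
```

Note that the individual matrices $X_1X_1'$ and $Y_1Y_1'$ have different roots here; only the roots of the product with the inverse are preserved.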
(A.4.3): If $X_1(p \times n_1)$ be of rank $n_1\ (< p)$ and $X_2(p \times n_2)$ $(p \le n_2)$ of rank $p$, then the characteristic roots of $(X_1X_1')(X_2X_2')^{-1}$ are invariant under the transformation: $X_1(p \times n_1) = A(p \times p)\, Y_1(p \times n_1)\, B_1(n_1 \times n_1)$ and $X_2(p \times n_2) = A(p \times p)\, Y_2(p \times n_2)\, B_2(n_2 \times n_2)$, where $A$ is any non-singular matrix and $B_1$ and $B_2$ two arbitrary $\perp$ matrices. The proof is on the lines of that of (A.4.2) and is thus obvious.
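(A.4.4): For $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$, with $X_1$ being $p \times n$ and $X_2$ being $q \times n$ $(p \le q,\ p+q \le n,\ \text{rank} = p+q)$, the characteristic roots of $(X_1X_1')^{-1}(X_1X_2')(X_2X_2')^{-1}(X_2X_1')$ are invariant under the transformation: $X_1(p \times n) = A_1(p \times p)\, Y_1(p \times n)\, B(n \times n)$ and $X_2(q \times n) = A_2(q \times q)\, Y_2(q \times n)\, B(n \times n)$, where $A_1$ and $A_2$ are any two non-singular matrices and $B$ is any $\perp$ matrix.

Proof: $c[(X_1X_1')^{-1}(X_1X_2')(X_2X_2')^{-1}(X_2X_1')]$
$= c[(A_1Y_1BB'Y_1'A_1')^{-1}(A_1Y_1BB'Y_2'A_2')(A_2Y_2BB'Y_2'A_2')^{-1}(A_2Y_2BB'Y_1'A_1')]$
$= c[(A_1Y_1Y_1'A_1')^{-1}(A_1Y_1Y_2'A_2')(A_2Y_2Y_2'A_2')^{-1}(A_2Y_2Y_1'A_1')]$ (since $B$ is $\perp$)
$= c[(A_1')^{-1}(Y_1Y_1')^{-1}(Y_1Y_2')(Y_2Y_2')^{-1}(Y_2Y_1')A_1']$
$= c[(Y_1Y_1')^{-1}(Y_1Y_2')(Y_2Y_2')^{-1}(Y_2Y_1')A_1'(A_1')^{-1}]$ (using (A.1.18))
$= c[(Y_1Y_1')^{-1}(Y_1Y_2')(Y_2Y_2')^{-1}(Y_2Y_1')]$, which proves (A.4.4).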


(A.4.5): For $X = \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix}$, with $X_1$, $X_2$, $X_3$ having $p$, $q$, $r$ rows respectively and $n$ columns $(p \le q,\ p+q+r \le n,\ \text{rank} = p+q+r)$, the roots of the equation in c of the form (A.2.4.1), i.e., of


с я Я [^ ES] 
Х.Х Х.Х; PEVA Х.Х; : 
eS 25] Be 
зХ XX sX; Х.Х 
i.e., of a 
(A.4.5.1) | e X, X, : Е 
[X, xg (XX; XJ 
| Xs. X; 


Xs X, 
| 1 3] [ (X; X3 
X; X; 


are invariant under the transformation 


X, |р p | A, 0 Ay Y, |р 
Aegis 0 As A, Y, | q xB(nxm), 
X; r t0 0 A, Үз | r 
n p q r п 
А, 0 A, 
where Bis | and A — E А, a | is any non-singular matrix. 
0 0 A, 


Proof: The proof follows by noting that B will pass out of the picture and 
the equation (A.4.5.1) can be written in terms of Y's and A's as 


A, 4; | | Y, Y, | 

Doy file [Y;: Yj] [Y? : Yi] 

0 As | | Y, Y, | 
B Ca E. 
n A, | | Y, 1: 1% y, 2: Үз] 


then, since A is non-singular, [ 0 4 | and | 4, | are both easily 
5 


checked to be non-singular. 


APPENDIX 5 
Some General Theorems in Jacobians 


(A.5.1): If $x(n \times 1) = A(n \times n)\, y(n \times 1)$, where $A$ is non-singular, then $J(x : y) = |A|$.

(A.5.2): If $X(m \times n) = A(m \times m)\, Y(m \times n)$, where $A$ is non-singular, then $J(X : Y) = |A|^n$.

(A.5.3): If $X(m \times n) = A(m \times m)\, Y(m \times n)\, B(n \times n)$, where $A$ and $B$ are non-singular, then $J(X : Y) = |A|^n |B|^m$.

(A.5.4): If $A$ and $B$ are each $\perp$, then $|A| = |B| = 1$ and (A.5.1) and (A.5.2), (A.5.3) will reduce respectively to $J(x : y) = 1$ and $J(X : Y) = 1$.
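
The statement (A.5.3) can be verified numerically by writing out the linear map $Y \to AYB$ as an $mn \times mn$ matrix and taking its determinant. The sketch below does this for hypothetical $A(2 \times 2)$ and $B(3 \times 3)$; the helper names are assumptions, and the determinant is computed by a simple Laplace expansion, which is adequate only for such small sizes.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, v in enumerate(m[0]):
        if v != 0.0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * v * det(minor)
    return total

def jacobian_matrix(a, b, m, n):
    """Matrix of partial derivatives of the elements of X = AYB with respect to
    the elements of Y (m x n), both listed row by row."""
    images = []
    for r in range(m):
        for c in range(n):
            e = [[1.0 if (i, j) == (r, c) else 0.0 for j in range(n)] for i in range(m)]
            x = matmul(matmul(a, e), b)
            images.append([x[i][j] for i in range(m) for j in range(n)])
    # images[k] holds the derivatives with respect to the k-th element of Y.
    return [[images[k][i] for k in range(m * n)] for i in range(m * n)]

m, n = 2, 3
A = [[2.0, 1.0], [0.0, 3.0]]                             # |A| = 6
B = [[2.0, 0.0, 0.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]  # |B| = 2

J = abs(det(jacobian_matrix(A, B, m, n)))
expected = abs(det(A)) ** n * abs(det(B)) ** m
print(J, expected)
```

Taking $B = I(n)$ in the same code gives the special case (A.5.2).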

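(A.5.5): If $y_i = f_i(x_1, \ldots, x_m, x_{m+1}, \ldots, x_{m+n})$ $(i = 1, \ldots, m)$, where the $x_j$'s $(j = 1, 2, \ldots, m+n)$ are subject to $n$ constraints

$f_i(x_1, \ldots, x_m, x_{m+1}, \ldots, x_{m+n}) = 0$ $(i = m+1, \ldots, m+n)$,

then (under the usual conditions for the existence of the Jacobian, including the non-vanishing of the numerator and the denominator in the following) we have, [42],

$J(y_1, \ldots, y_m : x_1, \ldots, x_m) = \dfrac{\partial(f_1, \ldots, f_m, f_{m+1}, \ldots, f_{m+n})}{\partial(x_1, \ldots, x_m, x_{m+1}, \ldots, x_{m+n})} \div \dfrac{\partial(f_{m+1}, \ldots, f_{m+n})}{\partial(x_{m+1}, \ldots, x_{m+n})}$.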


Proof: Let us denote by $\partial y_i/\partial x_j$ $(i, j = 1, \ldots, m)$ the partial differential coefficient of $y_i$ with respect to $x_j$ after having expressed $y_i$ $(i = 1, \ldots, m)$ in terms of $(x_1, \ldots, x_m)$, that is, after eliminating $(x_{m+1}, \ldots, x_{m+n})$ with the help of the constraints. Next denote by $|\partial y_i/\partial x_j|$ $(i, j = 1, 2, \ldots, m)$ the absolute value of the determinant of the $m \times m$ (square) matrix $[\partial y_i/\partial x_j]$ $(i, j = 1, 2, \ldots, m)$, so that

(A.5.5.1) $J(y_1, \ldots, y_m : x_1, \ldots, x_m) = \left| \dfrac{\partial y_i}{\partial x_j} \right|$ $(i, j = 1, 2, \ldots, m)$.

Then we have

$\dfrac{\partial y_i}{\partial x_j} = \dfrac{\partial f_i}{\partial x_j} + \sum_{k=m+1}^{m+n} \dfrac{\partial f_i}{\partial x_k} \dfrac{\partial x_k}{\partial x_j}$ $(i, j = 1, 2, \ldots, m)$.

Notice that in $\partial f_i/\partial x_j$ or $\partial f_i/\partial x_k$, $f_i$ is supposed to be expressed in terms of all the $(m+n)$ $x$'s and the partial differentiation is supposed to be with respect to $x_j$ or $x_k$ assuming all the other $(m+n-1)$ independent variates to be kept fixed, while in $\partial y_i/\partial x_j$ or $\partial x_k/\partial x_j$ it is supposed that $y_i$ $(i = 1, 2, \ldots, m)$ or $x_k$ $(k = m+1, \ldots, m+n)$ has first been expressed in terms of the $x_j$'s $(j = 1, 2, \ldots, m)$ and then the partial differentiation is made with respect to a particular $x_j$, assuming the other $(m-1)$ 'independent' variates to be kept fixed. Now from the set of $n$ constraints on the $x_j$'s $(j = 1, 2, \ldots, m+n)$ given by the conditions of (A.5.5) we have

(A.5.5.2) $\dfrac{\partial f_i}{\partial x_j} + \sum_{k=m+1}^{m+n} \dfrac{\partial f_i}{\partial x_k} \dfrac{\partial x_k}{\partial x_j} = 0$ $(i = m+1, \ldots, m+n$ and $j = 1, \ldots, m)$,

or, in matrix notation,

(A.5.5.3) $\left[ \dfrac{\partial f_i}{\partial x_j} \right] + \left[ \dfrac{\partial f_i}{\partial x_k} \right] \left[ \dfrac{\partial x_k}{\partial x_j} \right] = 0$ $(i, k = m+1, \ldots, m+n;\ j = 1, \ldots, m)$, or $\left[ \dfrac{\partial x_k}{\partial x_j} \right] = -\left[ \dfrac{\partial f_i}{\partial x_k} \right]^{-1} \left[ \dfrac{\partial f_i}{\partial x_j} \right]$.

(Note that, by the conditions of (A.5.5), $[\partial f_i/\partial x_k]$ can be assumed to be non-singular.)

Substituting from (A.5.5.3) in (A.5.5.1) we have

(A.5.5.4) $J(y_1, \ldots, y_m : x_1, \ldots, x_m) = \left| \left[ \dfrac{\partial f_i}{\partial x_j} \right] - \left[ \dfrac{\partial f_i}{\partial x_k} \right] \left[ \dfrac{\partial f_l}{\partial x_k} \right]^{-1} \left[ \dfrac{\partial f_l}{\partial x_j} \right] \right|$ $(i, j = 1, \ldots, m;\ k, l = m+1, \ldots, m+n)$

$= \left| \dfrac{\partial f_i}{\partial x_j} \right|_{i, j = 1, \ldots, m+n} \div \left| \dfrac{\partial f_i}{\partial x_j} \right|_{i, j = m+1, \ldots, m+n}$ (by using (A.1.1)),

which proves (A.5.5).

The real use of this theorem (as also of the next one) is in those situations where it would be difficult to express the $y_i$'s in terms of the $x_j$'s $(j = 1, \ldots, m)$ (after elimination of $x_{m+1}, \ldots, x_{m+n}$ with the help of the constraints), but where it is much easier to express the right hand side of (A.5.5.4) in terms of $(x_1, \ldots, x_m)$, or where even this explicit expression is not directly needed.
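
A minimal numerical illustration of (A.5.5), for the hypothetical case $m = n = 1$ with $y = f_1(x_1, x_2) = x_1 + x_2^2$ and the single constraint $f_2(x_1, x_2) = x_1^2 + x_2^2 - 1 = 0$, is sketched below. The right hand side of (A.5.5) is evaluated from the partial derivatives of $f_1$ and $f_2$ and compared with a direct finite-difference derivative obtained after eliminating $x_2$. The functions and the point chosen are assumptions made for this example.

```python
import math

def f1(x1, x2):
    """y = f1(x1, x2), a hypothetical choice."""
    return x1 + x2 ** 2

def constrained_jacobian(x1, x2):
    """Right hand side of (A.5.5) for m = n = 1, with f2 = x1^2 + x2^2 - 1."""
    d_f1 = (1.0, 2.0 * x2)        # (df1/dx1, df1/dx2), all variates treated as free
    d_f2 = (2.0 * x1, 2.0 * x2)   # (df2/dx1, df2/dx2)
    numerator = d_f1[0] * d_f2[1] - d_f1[1] * d_f2[0]   # 2 x 2 determinant
    denominator = d_f2[1]                                # 1 x 1 determinant
    return abs(numerator) / abs(denominator)

def direct_jacobian(x1, h=1e-6):
    """|dy/dx1| after eliminating x2 = sqrt(1 - x1^2), by central differences."""
    def y(u):
        return f1(u, math.sqrt(1.0 - u * u))
    return abs((y(x1 + h) - y(x1 - h)) / (2.0 * h))

x1 = 0.3
x2 = math.sqrt(1.0 - x1 * x1)
print(constrained_jacobian(x1, x2), direct_jacobian(x1))
```

In this toy case the elimination is easy, so both routes are available; the theorem is meant for situations where only the first route is practical.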


(A.5.6): If $F_i(y_1, \ldots, y_m;\ x_1, \ldots, x_m, x_{m+1}, \ldots, x_{m+n}) = 0$ $(i = 1, 2, \ldots, m+n)$ are a set of equations solvable in the real domain in the sense that corresponding to real $(x_1, \ldots, x_m)$ we can find real $(y_1, \ldots, y_m)$ and $(x_{m+1}, \ldots, x_{m+n})$, then, under the other usual conditions for the existence of the Jacobian (including the non-vanishing of the numerator and the denominator in the following), we have, [42],

$J(y_1, \ldots, y_m : x_1, \ldots, x_m) = \dfrac{\partial(F_1, \ldots, F_{m+n})}{\partial(x_1, \ldots, x_m, x_{m+1}, \ldots, x_{m+n})} \div \dfrac{\partial(F_1, \ldots, F_{m+n})}{\partial(y_1, \ldots, y_m, x_{m+1}, \ldots, x_{m+n})}$.


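Proof: As before we have

$J(y_1, \ldots, y_m : x_1, \ldots, x_m) = \left| \dfrac{\partial y_i}{\partial x_j} \right|$ $(i, j = 1, \ldots, m)$.

But from the basic conditions of (A.5.6) we have

(A.5.6.1) $\sum_{i=1}^{m} \dfrac{\partial F_k}{\partial y_i} \dfrac{\partial y_i}{\partial x_j} + \sum_{l=m+1}^{m+n} \dfrac{\partial F_k}{\partial x_l} \dfrac{\partial x_l}{\partial x_j} + \dfrac{\partial F_k}{\partial x_j} = 0$ $(k = 1, \ldots, m+n;\ j = 1, 2, \ldots, m)$.

Notice that in $\partial F_k/\partial y_i$, $\partial F_k/\partial x_l$ and $\partial F_k/\partial x_j$, $F_k$ is supposed to be expressed in terms of all the $(2m+n)$ variates $(y_1, \ldots, y_m, x_1, \ldots, x_{m+n})$ and the partial differentiation is with respect to $y_i$ or $x_l$ or $x_j$, keeping all the other $(2m+n)-1$ variates fixed. Also notice that, in $\partial y_i/\partial x_j$ or $\partial x_l/\partial x_j$, $y_i$ (or $x_l$) $(i = 1, 2, \ldots, m;\ l = m+1, \ldots, m+n)$ is supposed to be expressed in terms of the $x_j$'s $(j = 1, 2, \ldots, m)$ and then the partial differentiation is with respect to a particular $x_j$, keeping all the other $(m-1)$ of the $x$'s fixed.

(A.5.6.1) can be written as

(A.5.6.2) $\left[ \dfrac{\partial F_k}{\partial y_i} \right] \left[ \dfrac{\partial y_i}{\partial x_j} \right] + \left[ \dfrac{\partial F_k}{\partial x_l} \right] \left[ \dfrac{\partial x_l}{\partial x_j} \right] = -\left[ \dfrac{\partial F_k}{\partial x_j} \right]$,

where each side is a $(m+n) \times m$ matrix.

Taking, say, the first $m$ rows of this matrix equation (A.5.6.2) we shall have the square $(m \times m)$ matrix equation:

(A.5.6.3) $\left[ \dfrac{\partial F_k}{\partial y_i} \right] \left[ \dfrac{\partial y_i}{\partial x_j} \right] + \left[ \dfrac{\partial F_k}{\partial x_l} \right] \left[ \dfrac{\partial x_l}{\partial x_j} \right] = -\left[ \dfrac{\partial F_k}{\partial x_j} \right]$,

where now $i, j, k = 1, 2, \ldots, m$ and $l = m+1, \ldots, m+n$, and $[\partial F_k/\partial y_i]$ is square $(m \times m)$.

Again taking the last $n$ rows of the matrix equation (A.5.6.2) we have

(A.5.6.4) $\left[ \dfrac{\partial F_{k'}}{\partial y_i} \right] \left[ \dfrac{\partial y_i}{\partial x_j} \right] + \left[ \dfrac{\partial F_{k'}}{\partial x_l} \right] \left[ \dfrac{\partial x_l}{\partial x_j} \right] = -\left[ \dfrac{\partial F_{k'}}{\partial x_j} \right]$,

where now $i, j = 1, 2, \ldots, m$ and $k', l = m+1, \ldots, m+n$, so that $[\partial F_{k'}/\partial x_l]$ is now square $(n \times n)$.

Treating (A.5.6.3) and (A.5.6.4) as a pair of simultaneous equations in $[\partial y_i/\partial x_j]$ $(i, j = 1, \ldots, m)$ and $[\partial x_l/\partial x_j]$ $(l = m+1, \ldots, m+n$ and $j = 1, \ldots, m)$ and solving for them we have for $[\partial y_i/\partial x_j]$ the following:

(A.5.6.5) $\left[ \dfrac{\partial y_i}{\partial x_j} \right] = -\left( \left[ \dfrac{\partial F_k}{\partial y_i} \right] - \left[ \dfrac{\partial F_k}{\partial x_l} \right] \left[ \dfrac{\partial F_{k'}}{\partial x_l} \right]^{-1} \left[ \dfrac{\partial F_{k'}}{\partial y_i} \right] \right)^{-1} \left( \left[ \dfrac{\partial F_k}{\partial x_j} \right] - \left[ \dfrac{\partial F_k}{\partial x_l} \right] \left[ \dfrac{\partial F_{k'}}{\partial x_l} \right]^{-1} \left[ \dfrac{\partial F_{k'}}{\partial x_j} \right] \right)$.

But we have, by (A.1.1),

(A.5.6.6) $\left| \left[ \dfrac{\partial F_k}{\partial y_i} \right] - \left[ \dfrac{\partial F_k}{\partial x_l} \right] \left[ \dfrac{\partial F_{k'}}{\partial x_l} \right]^{-1} \left[ \dfrac{\partial F_{k'}}{\partial y_i} \right] \right| = \dfrac{\partial(F_1, \ldots, F_{m+n})}{\partial(y_1, \ldots, y_m, x_{m+1}, \ldots, x_{m+n})} \div \left| \dfrac{\partial F_{k'}}{\partial x_l} \right|$

and

$\left| \left[ \dfrac{\partial F_k}{\partial x_j} \right] - \left[ \dfrac{\partial F_k}{\partial x_l} \right] \left[ \dfrac{\partial F_{k'}}{\partial x_l} \right]^{-1} \left[ \dfrac{\partial F_{k'}}{\partial x_j} \right] \right| = \dfrac{\partial(F_1, \ldots, F_{m+n})}{\partial(x_1, \ldots, x_m, x_{m+1}, \ldots, x_{m+n})} \div \left| \dfrac{\partial F_{k'}}{\partial x_l} \right|$.

Substituting from (A.5.6.6) in (A.5.6.5) we have

(A.5.6.7) $J(y_1, \ldots, y_m : x_1, \ldots, x_m) = \left| \dfrac{\partial y_i}{\partial x_j} \right| = \dfrac{\partial(F_1, \ldots, F_{m+n})}{\partial(x_1, \ldots, x_m, x_{m+1}, \ldots, x_{m+n})} \div \dfrac{\partial(F_1, \ldots, F_{m+n})}{\partial(y_1, \ldots, y_m, x_{m+1}, \ldots, x_{m+n})}$,

which proves (A.5.6).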


(A.5.5) is really a special case of (A.5.6), which can be shown by putting in (A.5.6): $F_i = y_i - f_i(x_1, \ldots, x_{m+n})$ $(i = 1, 2, \ldots, m)$ and next $F_i = f_i(x_1, \ldots, x_{m+n})$ $(i = m+1, \ldots, m+n)$, that is, by assuming that the last $n$ equations are free from the $y$'s. Substituting in the right hand side of (A.5.6.7), we easily check that it goes over into the right hand side of (A.5.5.4).


It seems that (A.5.6) is a very general theorem in Jacobians and yields as 
special cases practically all the usual well-known Jacobian theorems. 


APPENDIX 6 
Jacobians of Certain Specific Transformations 


We shall consider the transformations (A.3.6), (A.3.8), (A.3.11) with rank $= p$, (A.3.14), (A.3.15), (A.3.17) and (A.3.18.19) and, in each case, pass on to $L_I$ from the postfactor and prefactor of the form $L$ or $M$ (subject to $LL' = I$) and discuss, for the different cases, the respective Jacobians (i) $J(X : M_I, c\text{'s}, L_I)$, (ii) $J(X_1, X_2 : A, c\text{'s}, L_{1I}, L_{2I})$, (iii) $J(X : \tilde{T}, L_I)$, (iv) $J(X_1, X_2 : U_1, U_2, \tilde{U}_3, U_4, c\text{'s}, L_{1I}, L_{2I})$, (v) $J(X_1, X_2 : \tilde{T}, c\text{'s}, L_I, L_{1I}, L_{2I})$, (vi) $J(X_1, X_2 : \tilde{T}, U, c\text{'s}, M_{1I}, M_{2I}, L_{2I})$ and (vii) $J(X_1, X_2 : A, B_1, B_2, \tilde{B}_3, B_4, c\text{'s}, L_I)$, where, in (vii), the $L_I$'s are the (so-called) independent elements formed, as in section (A.3.21), out of the matrix

$\begin{bmatrix} L_1 \\ L_3 \\ L_4 \end{bmatrix}$, with $L_1$ and $L_3$ each being $p \times n$ and $L_4$ being $(q-p) \times n$.


We shall first obtain the following two Jacobians which will be basic to the derivations of all the other ones.

(A.6.1): Jacobian of the transformation (A.3.11) (with rank $= p$), i.e., $J(X : \tilde{T}, L_I)$, where $\tilde{T}(p \times p)$ is non-singular with a positive diagonal. To obtain the Jacobian from $X$ to $\tilde{T}$ and $L_I$, we use (A.5.5), remembering that now $X = \tilde{T}L$ takes the place of $y_i = f_i$ and $LL' - I(p) = 0$ takes the place of $f_i = 0$. We also note that $\partial(LL' - I(p)) = \partial(LL')$. We have now, using (A.5.5),

(A.6.1.1) $J(X : \tilde{T}, L_I) = \dfrac{\partial(X, LL')}{\partial(\tilde{T}, L)} \div \dfrac{\partial(LL')}{\partial(L_D)}$,

where on the right, for practical usability, everything is expressed in terms of $\tilde{T}$ and $L_I$. The calculation of the numerator in the Jacobian of (A.6.1), (A.6.2) and (A.6.7) can be considerably abridged by expressing, in each case, that numerator in terms of Kronecker products (and sums) of matrices, otherwise known as direct products and direct sums. However, in this monograph, for expository purposes, a more familiar and straightforward, but lengthier method is given in each case. It is hoped that, for each problem, the reader will have no difficulty in verifying the main steps by spelling out in further detail on a sheet of paper. To calculate the numerator of (A.6.1.1) we proceed as follows.

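Put

$X = \begin{bmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & & \vdots \\ x_{p1} & \cdots & x_{pn} \end{bmatrix} = \begin{bmatrix} x_1 \\ \vdots \\ x_p \end{bmatrix}$ (say);

$L = \begin{bmatrix} l_{11} & \cdots & l_{1n} \\ \vdots & & \vdots \\ l_{p1} & \cdots & l_{pn} \end{bmatrix} = \begin{bmatrix} l_1 \\ \vdots \\ l_p \end{bmatrix}$ (say).

Also put $LL' = K$ with elements $k_{ij}$ $(i, j = 1, 2, \ldots, p;\ k_{ij} = k_{ji})$. Then (A.3.11) can be written as $x_i = (t_{i1} \ldots t_{ii}\ 0 \ldots 0) \times L$ $(i = 1, 2, \ldots, p)$ or

(A.6.1.2) $x_i' = [l_1' \;\vdots\; l_2' \;\vdots\; \cdots \;\vdots\; l_p'] \begin{bmatrix} t_{i1} \\ \vdots \\ t_{ii} \\ 0 \\ \vdots \\ 0 \end{bmatrix}$ $(i = 1, 2, \ldots, p)$.

To calculate $\dfrac{\partial(X, LL')}{\partial(\tilde{T}, L)} = \dfrac{\partial(X, K)}{\partial(\tilde{T}, L)}$ we display below the partial differential coefficients of $X$ and $K\ (= LL')$ with respect to the elements of $\tilde{T}$ and $L$ (all elements of $L$ being temporarily regarded as independent for purpose of the present differentiation):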


Gr ta 4 d. ф- typ | Aj 1; la I5 

ООО СОС от Поро AP 0 

хао Ny О ЗО ОУ D, D 0 0 
x10 гш О Шур р Ер Р РОЗ ОА 

ШИ pio о 0 

kop 0 2v, 

; lig IH KATA 0 

| 0 
ky | б 0 a irat b 
EUM Oyun DNO я ba 


where $D_a$ will stand for a diagonal matrix with diagonal elements all equal to $a$.
Recall that $x_i$ is $1 \times n$, $l_i$ is also $1 \times n$ ($i = 1, \ldots, p$) and $K(p \times p)$ has $p(p+1)/2$
independent elements, so that the above is really a $(np + p(p+1)/2) \times (np + p(p+1)/2)$
matrix. Now put


еба) My(pip-- 1/2 x p(p--1)/2) = 0: 
О Le ye i 
0 ib 
My p(p+1)/2x np) = 
P 0 
ker) Ube gi 


1, | 6 0 
My (np x p(p4-1)/2) = ; and 
0 „0 ЕМ БОДА 
iF D,, 0 0 0 
D D 0-0 
My(npxnp) = (notice that each Dis n x и). 
D р, ; D 
^ 
By (A.1.1) we shall now have
|M M р 
(A.6:1.4) Sn CP aie Mig [sen E 0 M3, | 
: | Mg Ma | np Ma My, | 


Pert np 


=?" |Ma] 10—20, Mi М„| =?” [Ma] |My, Мә Moy). 
Recalling the structure of $\tilde{T}$ we have


ieee Oot wil F^ Dai o : 0 
E peg 2 ( Dp: De: . 9 
fix | Mad = | 
[mou e 1 рор Dn» Dev . Dw 
so that $|M_{22}| = |\tilde{T}|^n$. We have furthermore
(A.6.1.5) LU NEM n TOC Moi mE D К Ors ea 
> yis: 9. eat ОГЭ er gg 
Msz Ms, = 
LL chos. X WERE s CEPR Е 
0 0.0 Qu ДЕ ЕТУЙ) 177 2 о al 
QE [On 70 i 0 0 p 
г, Do». [рр | 
0 ПТ Ма ACRES ipm 0 
0 0 р—1 
gr t» | 
ARO 
0 wr 1 


$= M_{21} N_{21}$ (say), where $N_{21}$ stands for the right matrix factor. We note that $N_{21}$ is
$p(p+1)/2 \times p(p+1)/2$ and is non-singular if $\tilde{T}$ is non-singular. We note also that
$M_{12}(p(p+1)/2 \times pn)\, M_{21}(pn \times p(p+1)/2)$ is $p(p+1)/2 \times p(p+1)/2$ and non-singular,
so that

(A.6.1.6) $|M_{12} M_{22}^{-1} M_{21}| = |M_{12} M_{21}|\,|N_{21}|$.

It is easy to check that

(A.6.1.7) $|N_{21}| = \prod_{i=1}^{p} t_{ii}^{-i+1} \big/ |\tilde{T}|$ and $|\tilde{T}| = \prod_{i=1}^{p} t_{ii}$.

It is also easy to verify, by using the condition $LL' = I(p)$, that


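(A.6.1.8) $|M_{12} M_{21}|_{LL' = I(p)} = 1$.

Hence (A.6.1) will now reduce to [31, 32]

(A.6.1.9)
$$J(X : \tilde{T}, L_I) = \left|\frac{\partial(X, LL')}{\partial(\tilde{T}, L)}\right| \Big/ \left|\frac{\partial(LL')}{\partial(L_D)}\right| = 2^p \prod_{i=1}^{p} t_{ii}^{\,n-i} \Big/ \left|\frac{\partial(LL')}{\partial(L_D)}\right|_{L_I},$$

so that we have

(A.6.1.10) $dX \to J(X : \tilde{T}, L_I)\, d\tilde{T}\, dL_I$,

where $J$ is given by (A.6.1.9).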
It is easy to check that, with $S = \tilde{T}\tilde{T}'$, we have

(A.6.1.11) $J(S : \tilde{T}) = 2^p \prod_{i=1}^{p} t_{ii}^{\,p-i+1}$,

so that

(A.6.1.12) $dS \to \left[2^p \prod_{i=1}^{p} t_{ii}^{\,p-i+1}\right] d\tilde{T}$.
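
The result (A.6.1.11) lends itself to a quick numerical check. The sketch below is an illustration only, not part of the original derivation: it assumes that $S = \tilde{T}\tilde{T}'$ with $\tilde{T}$ lower triangular, takes the $p(p+1)/2$ distinct elements of $S$ as functions of the $p(p+1)/2$ elements of $\tilde{T}$, forms the Jacobian determinant by central differences, and compares it with $2^p \prod t_{ii}^{p-i+1}$. All function names are hypothetical.

```python
import math
import random

def tril_to_matrix(vals, p):
    """Fill a p x p lower-triangular matrix row by row from a flat list."""
    T = [[0.0] * p for _ in range(p)]
    k = 0
    for i in range(p):
        for j in range(i + 1):
            T[i][j] = vals[k]
            k += 1
    return T

def s_elements(vals, p):
    """Distinct elements s_ij (i >= j) of S = T T', in the same row-by-row order."""
    T = tril_to_matrix(vals, p)
    return [sum(T[i][k] * T[j][k] for k in range(p))
            for i in range(p) for j in range(i + 1)]

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n = len(a)
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def numeric_jacobian(vals, p, h=1e-6):
    """Absolute value of the determinant d(S)/d(T), by central differences."""
    m = len(vals)
    jac = [[0.0] * m for _ in range(m)]
    for k in range(m):
        up = vals[:]
        dn = vals[:]
        up[k] += h
        dn[k] -= h
        fu, fd = s_elements(up, p), s_elements(dn, p)
        for r in range(m):
            jac[r][k] = (fu[r] - fd[r]) / (2 * h)
    return abs(det(jac))

def closed_form(vals, p):
    """2^p * prod t_ii^(p-i+1), i = 1..p (index i is 0-based in the code)."""
    T = tril_to_matrix(vals, p)
    return 2 ** p * math.prod(T[i][i] ** (p - i) for i in range(p))

random.seed(0)
p = 3
vals = [random.uniform(0.5, 2.0) for _ in range(p * (p + 1) // 2)]
num, exact = numeric_jacobian(vals, p), closed_form(vals, p)
```

Because the elements of $S$ are quadratic in those of $\tilde{T}$, the central differences are exact up to rounding, so `num` and `exact` should agree closely.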


Another transformation (together with its Jacobian) that is useful and inter- 
esting is the following: 


Хд 
(А.6.1.13) Х.Х) = 
х, 
9 A + 
ыл ce dun 0 2. 0 bn 
bya Я tnp ~ 1, 


This transformation is obtained if we start out from the transformation (A.3.11),
cut out the first $r$ rows of $X$ and the first $r$ rows of $\tilde{T}$ and assume that the first $r$ rows
of $L$ are given constants, in other words, that the transformation is from the variable
and truncated $X$ to the variable and truncated $\tilde{T}$ and the variable $l_{r+1}, \ldots, l_p$, and
that $l_1, \ldots, l_r$ are assumed to be given constants such that the whole $LL' = I(p)$. It
is easy to show that, with $t_{r+1,r+1}, \ldots, t_{pp}$ being, say, positive, this transformation
is also one-to-one. Let us denote the truncated $\tilde{T}$ by $\tilde{T}_{r+1,p}$, the truncated $L$ by $L_{r+1,p}$,
and the initial block of (constant) $L$ by $L_{1r}$. Then (A.6.1.13) can be rewritten as

(A.6.1.14)
$$X_{r+1,p} = \tilde{T}_{r+1,p} \begin{bmatrix} L_{1r} \\ L_{r+1,p} \end{bmatrix},$$

where the variable $L_{r+1,p}$ is subject to

(A.6.1.15) $L_{r+1,p} L_{r+1,p}' = I(p-r)$ and $L_{r+1,p} L_{1r}' = 0$,

$L_{1r}$ being a given matrix of constants, subject itself to $L_{1r} L_{1r}' = I(r)$.


It is easy to see that the independent elements of the variable truncated 
L(i.e., of Г,.1р) in this situation can be taken to be the same as of the truncated L,,, 5 
in the original set-up. Let us denote this by Ту ру consisting of L445, o lass 
sey lpi d, a8 elements and the dependent part by L,,,,5. We are now 
interested in the Jacobian (Ху: m L411), Which, by using (A.5.5) subject 
to (A.6.1.15) comes out to be 


(АВВ FSi rir Ша) = 


СА + | 
| (Х,ал, Las Las Lyaglis 


‚1, 
ra Dra) Vrbs re, pr 


Eu Obr Drap Legis | 
| O(L, +120) | Із 


To calculate the numerator on the right side of (A.6.1.16) we proceed in the same 
manner as in the beginning of this section, go back to the scheme of partial differen- 
tiation shown after (A.6.1.2) and observe that the same scheme will serve, subject 
to the following modifications. Omit all columns below 15,..., 44, ..., 4, and below 
pl and all rows along X;,..., X, and along ky, kia, kag, -e-s ege, bop, Ls, Ky. TÉ 
now we make the same kind of calculation as from (A.6.1.3) to (A.6.1.9) we can verify 
that the numerator on the right side of (A.6.1.16) will reduce to 


(A.6.1.17) $2^{p-r} \prod_{i=r+1}^{p} t_{ii}^{\,n-i}$,


and thus we have 


"E 
(A.6.1.18) 1X,4,— 27 TE qi Pray dus, 


A | OL, 41, Lus Lau Lir , 
| OL, rss) | ҮЙҮ ЫК, 
(A.6.2): Jacobian of the transformation (A.3.8), i.e., $J(X_1, X_2 : A, c\text{'s}, L_{1I}, L_{2I})$,
where $X_1(p \times n_1)$, $X_2(p \times n_2)$ ($p \le n_1, n_2$) are each of rank $p$, the $c$'s are distinct, and
$A$ is solid $p \times p$ non-singular with a positive first row. Putting $c_i = f_i^2$ ($i = 1, 2, \ldots, p$)
and using (A.1.1) we have


(A.6.2.1) HOGS Aa рат e 


UG Xe, Lgl, Lyle) 
0(А. 178, ds Ly) ‚ Ёв, 757 Dar 


(Lap). 


Л 
9) n, 


To evaluate the numerator we proceed as follows. Denote, as before, the row vectors
of $L_1$, $L_2$, $X_1$, $X_2$ by $l_{1i}$, $l_{2i}$, $x_{1i}$, $x_{2i}$ ($i = 1, 2, \ldots, p$) and $L_1L_1'$ by $(k_{1ij})$ and $L_2L_2'$ by
$(k_{2ij})$. Then the transformation can be written as


Чы) 
Lor 


“dy h ад 
(А.6.2.2) Xy = [li Ly] : + Xa [la -e lop] 
ay tp zi lip 
(= 1,9 у, ph or m full, 
im lip 0 0 а 
. аһ, 
0 
Xy 
арі 
Xp 0 1. 1, 
= > аы, 
йуу 
Xy lj, lj, 
Np 
Xop 
0 
ад 
0 E Tec 
— m gp m 
The scheme of partial differentiation is given below.


Act lY 


X, (Ig, t) (aisla) (а) 0 


edhe OS PATUD 0 (as) 
k, 0 0 (20 
‘ k, 0 0 ОС S 
where a’ = (tyy «Gp -ap + Upp), t = (f . ty). И 2 112i 
Xy Xn 
15 = (I le « lo, p1 6) X; = E » Хә = 
Xi Xop 
hu | En 
Tim T" П Е 0 0 
nh + dh. . 
kii kog | f 
k= DE. ha Ь 
` ` 0 Hees ral fete 
kup korp 
[E Esp p 
uhi . а Da4) Tie. D, rn) 
(ay, Ь) = : : 5 > (at) = 
ам. ајр - D, t 03) tips). c D, 0) 
etal T mE 20 D, - . - D, (M2) 
(L) = , (9%) = 
Qu oer b. D, 0) JE we D, ro) 


-1 
е i AT s dat ОВО
0 21, 0 21, 
О DO у) 0 
(5) = > (ls) = 
РОТУ nie I y DM Ue hr 
„10 RE 1, 1, p-1 0 d à p^ 1, 2-1 


We are interested in the absolute value of the determinant of the above matrix
(which is really the numerator in the Jacobian) and which is $\{p^2 + p + (n_1 + n_2)p\} \times$
$\{p^2 + p + (n_1 + n_2)p\}$. After some obvious manipulations we can take out a factor


2 fi 117? so that we have the whole determinant reducing to 
i=1 


Ma Ma My, 0 pn, 


Ma 0 0 My | ns 


(A.6.2.3) 2? fp m» Ч 
gis 0 0 Mg 0 | p(p+1)/2 
о о о My | pp--Uj? 
po p pm pn 
where 
T 31 9,0 S Mar RR ate 
Myy(pny x p?) = qu АД" р : 3 М,(рпх р?) = 
(eat ice MERE Dm t Bry ls 
э 
"gh. аһ D, (n3) G Day 3) 
My (pn, хр) = м: . ; Mia(pn, хр) = ; 
ы . а D, (4) . D, r3) 4 


(na) 


p 


D, (n2) : D,, 
My(pn; x png) = 
Dons). D, (n5) 
) É 0 0 0 Fi
xi: | ee хт) > е, ц : 
12"2 11"1 
р 0 bra Сү; Жу peers 
ы 0 0 0 
Ma (fee) x pn) = Pal: C vr 
0 lop 155-1 


Hence we should have 


(А.6.2.4) 


p(p--1) 
NS 


Е ‘| ui 0 | E 0 JE "d 
0 0 pp) 0 My 90 M, - Ma 0 
2 


My 0 [= 0 ithe 0 ] My El 
: о Milk о My о MglLM, o 
М»Мӊ [Mn : М] 
=| M| | Mes] ` 
T Mta Mai о] 




It is now easy to check that 
Dau(m) . Darin (nj) 
(4.6.2.5) | My | — | 4|", | My] — | 4| "апа also | Ит! | = 
t Dawin) . Damn) 
and Mj is exactly of this form, each D being of n, dimensions. Hence we shall have 


Oh . Gt . ачи. а 


1р7 1p"p 
(4.6.2.0) Мам = 
Oh . аз. ат . a? yt, 
АВ 03 К 0 Dau(p) . Dan(p) 
ООО TRAN [os e ly, Daw(p) . Daii(p) 
= My D(suppose) where we denote the right hand matrix factor by D. In an exactly 
similar manner we have Mz} M, = Мр. Next we have 
ly £30 
(4.6.2.7) My Me = = Ny (say) 
ФУД, 


(using UU- = I(p)). Thus we have 


D 07 у? 
(А.6.2.8) Mig [Mj : My] = [MD : Ny»| = рт, i зы | 
: p p 


057 
p» p 
j D от у? 
and Mat [Ma : 0] = pna[ My, : 0] › 
^ 0 1-9 
pop 
80 that (A.6.2.4) now reduces to 
(A.6.2.9) VIZ +a My x My : | [ D 0 | 
Max Ma : 0 0 I(p) 
TA S d 
реки ie Jens that| D| — | 4 |» 
Max М»: 


and |Z(p)| = 1). 


— 
Using $L_1L_1' = L_2L_2' = I(p)$, the structure and reduction of the 2nd factor
(which is a determinant) can now be displayed and visualized by considering p = 3,
which will make immediately obvious the corresponding structure and mechanism
of reduction for the general case. Below is given the case of p = 3.


mod (6 —1$)( —1)((—15). 


In the general case this is easily checked to be replaceable by mod т (2—2), 
i<j=1 


so that, substituting in (A.6.2.3) and noting that /? = с. we have 


(Xi: Х,, LL. L als) 2p hr Ыр! | Hna iP adil п (202); 


(А.6.2.10) 


me, t, pow 1%) 4, t Iir A Qul ici 
so that 
p Ms д(Х,, ХА, LaL) 
(A.6.2.11) JO Х,: Ас. Lap, Lap = PO eae 
PE 2027 (Lele) 
9004) || (Lan) 
oP m+n PT е7 nod TL © S a(L,L;) (LL, 
zo rs rs, Yer ay) rz, | Lan) 


It may be noticed now that (A.6.2.11) is the Jacobian (ii) mentioned in the beginning
of (A.6), i.e., $J(X_1, X_2 : A, c\text{'s}, L_{1I}, L_{2I})$, and (A.6.1.9) is the Jacobian (iii) mentioned
there, i.e., $J(X : \tilde{T}, L_I)$.
(A.6.3): Jacobian of the transformation (A.3.6), i.e., $J(X : M_I, c\text{'s}, L_I)$, where
$X(p \times n)$ ($p \le n$) is of rank $p$, the $c$'s are distinct and $M$ has a positive first row. By
straightforward methods of exactly the kind used in the preceding subsection (A.6.1),
which are rather lengthy, it can be shown that

(A.6.3.1)
$$J(X : M_I, c\text{'s}, L_I) = 2^p \prod_{i=1}^{p} c_i^{(n-p-1)/2}\ \operatorname{mod} \prod_{i<j=1}^{p} (c_i - c_j) \Big/ \left[\, \left|\frac{\partial(MM')}{\partial(M_D)}\right|_{M_I} \left|\frac{\partial(LL')}{\partial(L_D)}\right|_{L_I} \right].$$

But a shorter proof of this result can be given by combining (A.6.1.9) and (A.6.2.11) 

in the following way. By (A.6.2.11) we have 


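(A.6.3.2)
$$J(X_1, X_2 : A, c\text{'s}, L_{1I}, L_{2I}) = 2^p |A|^{n_1+n_2-p} \prod_{i=1}^{p} c_i^{(n_1-p-1)/2}\ \operatorname{mod} \prod_{i<j=1}^{p} (c_i - c_j) \Big/ \left[\, \left|\frac{\partial(L_1L_1')}{\partial(L_{1D})}\right|_{L_{1I}} \left|\frac{\partial(L_2L_2')}{\partial(L_{2D})}\right|_{L_{2I}} \right],$$

where $X_1(p \times n_1) = A D_{\sqrt{c}} L_1$ and $X_2(p \times n_2) = A L_2$.

Also, using (A.3.11), we put

(A.6.3.3) $A(p \times p) = \tilde{T}(p \times p)\, M(p \times p)$,

where $M$ is orthogonal and, using (A.6.1.9), we have

(A.6.3.4)
$$J(A : \tilde{T}, M_I) = 2^p \prod_{i=1}^{p} t_{ii}^{\,p-i} \Big/ \left|\frac{\partial(MM')}{\partial(M_D)}\right|_{M_I}.$$

Next put

(A.6.3.5) $X_1(p \times n_1) = \tilde{T}(p \times p)\, X(p \times n_1)$, $\quad M(p \times p)\, L_2(p \times n_2) = M_2(p \times n_2)$ (say)

(so that $X = M D_{\sqrt{c}} L_1$, $X_2 = \tilde{T} M_2$) and note from orthogonality of $M$ that

(A.6.3.6) $|A| = |\tilde{T}| = \prod_{i=1}^{p} t_{ii}$ and $M_2M_2' = M L_2 L_2' M' = I(p)$.

We thus have

(A.6.3.7)
$$\begin{aligned} J(X_1, X_2 : A, c\text{'s}, L_{1I}, L_{2I}) &= J(X_1, X_2 : \tilde{T}, M_I, c\text{'s}, L_{1I}, L_{2I}) \big/ J(A : \tilde{T}, M_I) \\ &= J(X_1 : X)\, J(X, X_2 : \tilde{T}, M_I, c\text{'s}, L_{1I}, M_{2I})\, J(M_{2I} : L_{2I}) \big/ J(A : \tilde{T}, M_I) \\ &= J(X_1 : X)\, J(X : M_I, c\text{'s}, L_{1I})\, J(X_2 : \tilde{T}, M_{2I})\, J(M_{2I} : L_{2I}) \big/ J(A : \tilde{T}, M_I). \end{aligned}$$

Now notice that

(A.6.3.8)
$$J(X_1 : X) = |\tilde{T}|^{n_1} = \prod_{i=1}^{p} t_{ii}^{\,n_1}, \qquad J(X_2 : \tilde{T}, M_{2I}) = 2^p \prod_{i=1}^{p} t_{ii}^{\,n_2-i} \Big/ \left|\frac{\partial(M_2M_2')}{\partial(M_{2D})}\right|_{M_{2I}}$$
$$\text{and} \qquad J(A : \tilde{T}, M_I) = 2^p \prod_{i=1}^{p} t_{ii}^{\,p-i} \Big/ \left|\frac{\partial(MM')}{\partial(M_D)}\right|_{M_I}.$$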


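Now to evaluate $J(M_{2I} : L_{2I})$ we temporarily regard $M$ as a constant but orthogonal matrix,
notice that $L_2L_2' = I(p)$ is equivalent to $M_2M_2' = I(p)$ and now using (A.5.5) we find

(A.6.3.9)
$$J(M_{2I} : L_{2I}) = \left|\frac{\partial(M_2 - ML_2,\ M_2M_2')}{\partial(M_{2D},\ L_2)}\right| \Big/ \left|\frac{\partial(M_2 - ML_2,\ M_2M_2')}{\partial(L_{2D},\ M_2)}\right| = \left|\frac{\partial(M_2M_2')}{\partial(M_{2D})}\right|_{M_{2I}} \Big/ \left|\frac{\partial(L_2L_2')}{\partial(L_{2D})}\right|_{L_{2I}}.$$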

Now substituting in the left hand side of (A.6.3.7) from (A.6.3.2) and (A.6.3.6) and
in the right hand side from (A.6.3.8) and (A.6.3.9) and putting $L_1 = L$ (say), we have
the Jacobian (A.6.3.1).
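
The only arithmetic in this substitution is the bookkeeping of the powers of $t_{ii}$: the factors $|\tilde{T}|^{n_1}$, $\prod t_{ii}^{n_2-i}$ and $\prod t_{ii}^{p-i}$ must combine into $|A|^{n_1+n_2-p}$. The sketch below, with hypothetical function names and arbitrary test values, checks that step numerically; it assumes only $|A| = |\tilde{T}| = \prod t_{ii}$ as in (A.6.3.6).

```python
import math

def combined_t_powers(t, n1, n2):
    """|T|^n1 * prod t_ii^(n2-i) / prod t_ii^(p-i), with i running over 1..p."""
    p = len(t)
    det_T = math.prod(t)
    num = det_T ** n1 * math.prod(t[i] ** (n2 - (i + 1)) for i in range(p))
    den = math.prod(t[i] ** (p - (i + 1)) for i in range(p))
    return num / den

def det_A_power(t, n1, n2):
    """|A|^(n1+n2-p), using |A| = |T| = prod t_ii."""
    return math.prod(t) ** (n1 + n2 - len(t))

t_diag = [1.3, 0.7, 2.1]   # arbitrary positive diagonal of T (p = 3)
left = combined_t_powers(t_diag, 5, 4)
right = det_A_power(t_diag, 5, 4)
```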

(A.6.4): Jacobian of the transformation (A.3.15), i.e, (Ху, Xy : T, с, Ly, 
Lir, Lor) where Т is non-singular with a positive diag D and c's are distinct. Using 


(А.6.1.9) we have J(X, : P, La) = 2” i w 


ї=1 


Next, using (A.6.3.1) we have 


p—m—1 


eras LOT, ат L'L)| |o) 
J(X 7: Lj, ¢’s, L ЭТ Р 95 mod TE (ос ac As ik 5 
| Пу Шу Sea 05) |р, | 90» |р 
From these it is easy to check that 
p—n—1 
(A.6.4.1) Л Xov os, Ly Lg Da) = 28 Te “th Gg? 
i=l 
bee A(L'L)| | NLL) (LL) 
Ж d П. (6—6; di 
Е шжу (0055) | an) |n, 


(A.6.5): Jacobian of the transformation (A.3.17), ie. J(X4, Xo: T.U, св, 
Mir: Mar, Loz), when the c's are distinct, T is non-singular with a positive diagonal and 
U is non-singular solid with a positive first row. Using (A.6.1.9) we have J(X,: 7, La) 


zw me WAR 
it 


Us i L] is L). 


E Next, notice that J ARDE LI) = 3 (sinoe 
(91,2) 
Next, using (A.6.2.12), we have JOGLL; : D;] : U, ез, Му, М)


(M, M2) 
9(M y) 


n—g—p—1 5 7 Й 
= 2710] fle ® mod T (66) / |2040) 
* el i (Min) ie 


Moy 


It is easy to check, by combining the three Jacobians, that


n—4—p—1 

7 р c зр 323 
(4.6.5.1) ДХ, X, : T, U, e's, Му, Mar, Lp) = 2" ft R UA e, 

i=1 


i=1 


{б АБ) 


(Lap) 


A MMs) 
Mr (Map) Wm 


2-1 
x mod II (6—8), 
i< ј=1 iy 


Tee 


We recall from (A.3.17) that if €; = (1—¢,)/e(i = 1, ..., p), then the c;'s are the roots 
of the equation in c: [e(X,X1)—(X,X5)X,X;)-1((X,X:)| = 0. In terms of the c’s, 
therefore, we should have the Jacobian given by 


(A.6.5.2) JG, X, : T, U, c8 My, Мы, Dy) = 27* fl 0" 
і=1 
m n—p—q—1 / "®—ч+2 24 D M 
x КЕ 2 CIA mod "Il (ос) О) ОСИ M) 
i=1 5 i<j=1 (Mp) M, | (Msp) Mar 
x Laks) 
9(Lsp) |у, 


(A.6.6): Jacobian of the transformation (A.3.19), i.e., J(Xy, Xy, Xa : Za, Zip; 


Саа) 
ay) |z, 


Next we notice that J(Xy, Xy : Ay, Zio, Zor, Жы) = a(x ) : ( 2: 2) l || RE 
22^ Mag 


Za Za, T, Та). Using (A.6.1.9) we have JX, : f, Ly) = 2 Yr ai 
i-1 


: ZU. „ЖЕ ү 
since | A is |. Therefore it is easily checked that the total Jacobian 


ДОА) 
an) fr, 


(A.6.7): Jacobian of the transformation (A.3.14), i.e., J(X,, X, : U,, 02, Os, 
U4, c's, Lir Lor), where Xi(pxn) (p > 74) is of rank т, Ху(р хт) (p < ng) is of rank 
P, the c's are distinct and U. 1 has a positive first row and 0, a positive diagonal. We 
start with the transformation (A.3.15) and use the Jacobian result(A.6.4.1) and rename 


the symbols. The transformation is Х(рхт) = Tp Xp) L(pxn) D (n x m) 


T Р 
= 2 i 
t=] 
хв Xm), Х(рхть) = Hp xp) Март) subject to L, being |, MM; = Ip),
L/L = I(n,) and T being non-singular, and the Jacobian being given by 


Е p-n-l 
(A.6.7.1) Fixe Uf vs IU Mae itg ttes Il c, 
i=1 i=1 
pico (L'L)} |900,1) [@(М„М;) 
x mod П (e;—6)— |-74,—- collet AS кач 8 Ў 
ї<ј=1 / A(Lp) Lj 9p) UT 00) Mər 
Let us write 
Kı | p=% 
L(pxm)- : 
K, 0, 
ny 
K, | p—t 
To this L now, if we adjoin, as we could, a matrix such that K(p x p) 
a n 
p—n 
К, К, |р—% 
= is orthogonal (note that this could be done since Z/L = I(n,)), 
КК, M 
nyp- Ri 


it will be seen that the number of independent elements in $K$ is the same as in $L$. This
is verified as follows:

In $L$ (by virtue of $L'L = I(n_1)$) the number of independent elements is
$pn_1 - n_1(n_1+1)/2$. In $K$ the total number of elements is $p^2 - (p-n_1)(p-n_1-1)/2$ and,
by virtue of $KK' = I(p)$, the number of constraints is $p(p+1)/2$, so that the number
of independent elements is $p^2 - (p-n_1)(p-n_1-1)/2 - p(p+1)/2 = pn_1 - n_1(n_1+1)/2$.
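
The count above is a polynomial identity in $p$ and $n_1$, and it can be confirmed by brute force over a small grid. The following sketch is illustrative only (the function names are not from the text) and simply evaluates both counts for all $1 \le n_1 \le p \le 8$.

```python
def independent_in_L(p, n1):
    """p*n1 elements of L(p x n1) minus the n1(n1+1)/2 constraints L'L = I(n1)."""
    return p * n1 - n1 * (n1 + 1) // 2

def independent_in_K(p, n1):
    """Elements of K (p^2 less a block of structural zeros) minus the constraints KK' = I(p)."""
    total = p * p - (p - n1) * (p - n1 - 1) // 2
    constraints = p * (p + 1) // 2
    return total - constraints

# Collect every (p, n1) pair on the grid for which the two counts differ.
mismatches = [(p, n1)
              for p in range(1, 9)
              for n1 in range(1, p + 1)
              if independent_in_L(p, n1) != independent_in_K(p, n1)]
```

An empty `mismatches` list is what the identity predicts.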
If we now put 


U, б, D ру T. 0 K, K, P= hny 
(A.6.7.2) О(рхр) = = È 
U, Using nyc Tue К, К,А, 


Ny p—, p—n,, Mm p— m 


(by examining the right hand side we note that the left hand side is really of the
structure indicated), we observe that the number of independent elements in $U$, which
is $p^2 - (p-n_1)(p-n_1-1)/2$, is the same as in $(\tilde{T}, K)$, i.e., as in $(\tilde{T}, L)$, which is
$p(p+1)/2 + pn_1 - n_1(n_1+1)/2$. It will be shown in the next article (and we assume
the result here) that


~ ~ p—n. А (А 
(А.6..8)  J(U : P, К) = JU : P, Ly) = 2^ ft gee пм EOD 
i-1 i=1 O( Lp) 1 


so that, by taking the inverse, we should have 


—т 


(А674) JP, K,: 0) = JË, Lr: U) E ae ee 
i-1 


| (Lp) 


L; 


TAS 
SEDET ТУТУ 
i=l 
Also if we put

(A.6.7.5) $L_2(p \times n_2) = K(p \times p)\, M_2(p \times n_2)$

(where by virtue of $KK' = K'K = M_2M_2' = I(p)$ we have $L_2L_2' = I(p)$), then exactly
as in (A.6.3) (treating $K$ as a constant orthogonal matrix) we have


д 901,213) 001,12) 
(A.6.7.6) JO, : Da) = |M] | OC i 
Из) |м„ | 9025) [ы ; 
Thus we have X ‹ 
7i 9(M,M;) eLL)| ?-m p—n-—i ? 

A.6.7.7. JUL LL MU. T. зад ПО (а) 
f frein dt 21) 9055) и, | D a `” 

он fl ipa OLD) р 

il 001.) » 
Using these and remembering that $|U| = |\tilde{T}| = \prod_{i=1}^{p} t_{ii}$ we have

(A.6.7.8) JUS. X, : в, U Ly, La) = WX, Х,: 6 T, Ly, La, My) 


x JP, Lr Мы: 0, Ly) 


p—n;—1 ni—l p 


n = ] М 
О ааа TI (=) П (uy P7 e [LL 
i=1 i<j=1 i=1 0001) L 
1 
(1р) ar } 


which gives J(X,, Xa: e U, D, Lir): 


Now for the proof of (A.6.7.3) with a transformation of the form (A.3.14) we
proceed as follows. We start from (A.6.7.2), postmultiply both sides by the $p \times p$
matrix $\begin{bmatrix} I(n_1) & 0 \\ 0 & M \end{bmatrix}$, where $M$ (which is $(p-n_1) \times (p-n_1)$) is orthogonal, and then write


т PM | 


U, OFT o U, б, Y, Vs 
(A.6.7.9) U= = = | (say) 
U, U.S Lo a U, U,M. Vena 


= V(say) 


х Г | d 
ESI KG = TN (say). 
Lo M ree 


m 
Then using (A.6.1.9) and (A.5.5) and (A.5.6) we have
А 


4 NI ME NN’) op % sa] XR?) 
(A.6.7.10) — J(y:T,ky— figs РО) әй 7, EE 
dn ip^ / |207) а, ia" |90) |x, 
pM i) 
(Mp) |m, 
and also 
(A.6.7.11) ЛЕО Мр" m uut m | e 
i=l / (Mp) M, 


Using (A.5.6) and taking account of the remarks after (A.6.6.1) it is easy to 


check that 
001) 
212) |р, 


Now combining (A.6.7.9), (A.6.7.10), (A.6.7.11) and (A.6.7.12), we have


puc) 
(Къ) 


(A.6.7.12) 
Kr 


Eg P pam at JAMM’) 
dV — 29" П (и)? aU dM, di 
— £ (tgi) Ei 1/ OM) |у, 
ee _|a(L'L) ММ!) 2 ; 
>»? i tg! dT dL,d M, = о л st ЕЕ . which proves (А.6.7.3). 
ay TU TU) jn, | LD) (и, d 


(A.6.8): Jacobian of the transformation (A.3.18.19), i.e., $J(X_1, X_2 : A, B_1,$
$B_2, B_3, B_4, c\text{'s}, L_I)$, where the $c$'s are distinct, $A$ is non-singular with a positive first row,
$B_1$ has a positive diagonal and $B = \begin{bmatrix} B_1 & B_3 \\ B_2 & B_4 \end{bmatrix}$ is non-singular. This Jacobian can
be derived in the same manner as in sub-section (A.6.2). We shall not need it in
this monograph and so will not derive it. We merely state without proof that


(A.6.8.) | J(Xy, X5 : A, В, с', Lr) = 2"|A [7| Ble TL (Ben 
i 


берес opel —1 , 
x fi (—«) ^? бг ной Ai (ci—6) / До) ^ 
б i<j=1 (Ly i 
Ly p 
where L=| Ls q and is subject to LL' = I(p-4-q). [31]. 
Dp 


APPENDIX 7 
Canonical Reduction of Certain Distribution Problems 
(A.7.1): If $X(p \times n)$ ($p \le n$) has the probability law (4.13):

$$\left[1 \big/ (2\pi)^{pn/2} |\Sigma|^{n/2}\right] \exp\left[-\tfrac{1}{2} \operatorname{tr} \Sigma^{-1} XX'\right] dX,$$

then the distribution of the characteristic roots of $XX'$ (to be called $c$'s) could not involve
as parameters anything except the characteristic roots of $\Sigma$ (to be called $\gamma$'s).

Proof: Note that, a.e., $XX'$ is p.d. so that, a.e., all roots $c(XX')$ are positive.
Notice also that, a.e., they are also distinct. It is of course assumed that $\Sigma$ is symmetric
p.d., so that all $c(\Sigma)$'s, i.e., $\gamma$'s, are positive. Using (A.3.3), set $\Sigma = \mu D_\gamma \mu'$,
where $\mu$ is orthogonal. We have now $\operatorname{tr} \Sigma^{-1}XX' = \operatorname{tr}(\mu D_{1/\gamma}\mu')XX' = \operatorname{tr} D_{1/\gamma}\,\mu'XX'\mu$
(using (A.1.5) and the orthogonality of $\mu$). Now put $\mu'X = Y$ or $X(p \times n) = \mu(p \times p)$
$\times\, Y(p \times n)$ and observe that, by (A.4.1), $c(XX') = c(YY')$, and $\operatorname{tr} D_{1/\gamma}\,\mu'XX'\mu =$
$\operatorname{tr} D_{1/\gamma} YY'$. Also by (A.5.2), $J(X : Y) = |\mu|^n = 1$ (in absolute value).

Remembering further that $|\Sigma| = |\mu|^2 \prod_{i=1}^{p} \gamma_i$, it is easy to check that $Y$ has
the probability law:

(A.7.1.1)
$$\left[1 \Big/ (2\pi)^{pn/2} \prod_{i=1}^{p} \gamma_i^{\,n/2}\right] \exp\left[-\tfrac{1}{2} \operatorname{tr} D_{1/\gamma}\, YY'\right] dY,$$

which, in view of the fact that $c(XX') = c(YY')$, proves (A.7.1). For the distribution
of $c(XX')$, therefore, we can, without any loss of generality, start directly from
the above form of probability law, which is accordingly a canonical law for this purpose.
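
The two facts used in this proof, $c(XX') = c(YY')$ and $\operatorname{tr}\Sigma^{-1}XX' = \operatorname{tr} D_{1/\gamma}YY'$ when $Y = \mu'X$ with $\mu$ orthogonal, can be illustrated numerically for $p = 2$. The sketch below is only an illustration under assumed values: the rotation angle, the $\gamma$'s and the matrix $X$ are arbitrary choices, and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def sym_roots2(s):
    """Characteristic roots of a symmetric 2 x 2 matrix from its trace and determinant."""
    half_tr = (s[0][0] + s[1][1]) / 2
    d = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    disc = math.sqrt(max(half_tr * half_tr - d, 0.0))
    return [half_tr - disc, half_tr + disc]

phi = 0.6                                   # arbitrary rotation angle
mu = [[math.cos(phi), -math.sin(phi)],
      [math.sin(phi), math.cos(phi)]]       # an orthogonal mu
gamma = [2.0, 0.5]                          # assumed roots of Sigma
D_inv = [[1 / gamma[0], 0.0], [0.0, 1 / gamma[1]]]
Sigma_inv = matmul(matmul(mu, D_inv), transpose(mu))   # (mu D_gamma mu')^{-1}

X = [[1.0, 2.0, 0.5], [-0.3, 0.8, 1.7]]     # arbitrary X (p = 2, n = 3)
Y = matmul(transpose(mu), X)                # Y = mu' X

XXt = matmul(X, transpose(X))
YYt = matmul(Y, transpose(Y))
roots_X, roots_Y = sym_roots2(XXt), sym_roots2(YYt)
trace_X = trace(matmul(Sigma_inv, XXt))
trace_Y = trace(matmul(D_inv, YYt))
```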


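(A.7.2): If $X_1(p \times n_1)$, $X_2(p \times n_2)$ ($p \le n_1, n_2$) have the joint probability law:

$$\left[1 \big/ (2\pi)^{p(n_1+n_2)/2} |\Sigma_1|^{n_1/2} |\Sigma_2|^{n_2/2}\right] \exp\left[-\tfrac{1}{2} \operatorname{tr}(\Sigma_1^{-1}X_1X_1' + \Sigma_2^{-1}X_2X_2')\right] dX_1\, dX_2,$$

$\Sigma_1$ and $\Sigma_2$ being each symmetric p.d., then the distribution of $c(X_1X_1'(X_2X_2')^{-1})$ (to
be called $c$'s) could not involve as parameters anything except the $c(\Sigma_1\Sigma_2^{-1})$'s (to be
called $\gamma$'s).

Proof: Notice that, a.e., the $c(X_1X_1'(X_2X_2')^{-1})$ are positive and distinct. Since
$\Sigma_1$ and $\Sigma_2$ are each p.d., use (A.3.4) to set $\Sigma_1 = \mu D_\gamma \mu'$ and $\Sigma_2 = \mu\mu'$, where $\mu$ is
non-singular and all $\gamma$'s are positive. We have now, using (A.1.5), $\operatorname{tr} \Sigma_1^{-1}X_1X_1' =$
$\operatorname{tr} D_{1/\gamma}\,\mu^{-1}X_1X_1'\mu'^{-1}$ and $\operatorname{tr} \Sigma_2^{-1}X_2X_2' = \operatorname{tr} \mu^{-1}X_2X_2'\mu'^{-1}$. Now put $\mu^{-1}X_1 = Y_1$ and $\mu^{-1}X_2$
$= Y_2$, i.e., $X_1(p \times n_1) = \mu(p \times p)\, Y_1(p \times n_1)$ and $X_2(p \times n_2) = \mu(p \times p)\, Y_2(p \times n_2)$, and
observe that, by (A.4.2), $c(X_1X_1'(X_2X_2')^{-1}) = c(Y_1Y_1'(Y_2Y_2')^{-1})$. Also, by (A.5.2),
$J(X_1, X_2 : Y_1, Y_2) = |\mu|^{n_1+n_2}$.

Remembering further that $|\Sigma_1| = |\mu|^2 \prod_{i=1}^{p} \gamma_i$ and $|\Sigma_2| = |\mu|^2$, we check that
$Y_1$ and $Y_2$ have the joint probability law:

(A.7.2.1)
$$\left[1 \Big/ (2\pi)^{p(n_1+n_2)/2} \prod_{i=1}^{p} \gamma_i^{\,n_1/2}\right] \exp\left[-\tfrac{1}{2} \operatorname{tr}(D_{1/\gamma}\, Y_1Y_1' + Y_2Y_2')\right] dY_1\, dY_2,$$

which, in view of the fact that $c(X_1X_1'(X_2X_2')^{-1}) = c(Y_1Y_1'(Y_2Y_2')^{-1})$, proves (A.7.2).
If we are interested in the distribution of these roots, i.e., of the $c$'s, we can, without
any loss of generality, start right away from the above form which, for the purpose
of this problem, will thus be called a canonical distribution law.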
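
The invariance $c(X_1X_1'(X_2X_2')^{-1}) = c(Y_1Y_1'(Y_2Y_2')^{-1})$ under $X_1 = \mu Y_1$, $X_2 = \mu Y_2$ holds for any non-singular $\mu$, because the two matrices are similar. A small numerical illustration for $p = 2$ follows; the matrices chosen are arbitrary and the helper names are hypothetical, so this is a sketch rather than part of the proof.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inv2(m):
    """Inverse of a non-singular 2 x 2 matrix."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def roots2(m):
    """Roots of a 2 x 2 matrix whose roots are real, from trace and determinant."""
    half_tr = (m[0][0] + m[1][1]) / 2
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(half_tr * half_tr - d, 0.0))
    return [half_tr - disc, half_tr + disc]

def c_roots(x1, x2):
    """Roots of X1 X1' (X2 X2')^{-1} for p = 2."""
    s1 = matmul(x1, transpose(x1))
    s2 = matmul(x2, transpose(x2))
    return roots2(matmul(s1, inv2(s2)))

mu = [[2.0, 0.5], [-1.0, 1.5]]                        # any non-singular mu
Y1 = [[1.0, 0.2, -0.7], [0.4, 1.1, 0.3]]              # p = 2, n1 = 3
Y2 = [[0.9, -0.4, 0.6, 1.2], [0.1, 1.3, -0.8, 0.5]]   # p = 2, n2 = 4
X1, X2 = matmul(mu, Y1), matmul(mu, Y2)
roots_from_X = c_roots(X1, X2)
roots_from_Y = c_roots(Y1, Y2)
```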


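(A.7.3): If $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}\begin{matrix} p \\ q \end{matrix}$ ($p \le q$, $p + q \le n$) has the probability law (4.15):

$$\left[1 \big/ (2\pi)^{(p+q)n/2} |\Sigma|^{n/2}\right] \exp\left\{-\tfrac{1}{2} \operatorname{tr} \Sigma^{-1} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} [\,X_1' \ \ X_2'\,]\right\} dX_1\, dX_2,$$

where $\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{bmatrix}\begin{matrix} p \\ q \end{matrix}$ is supposed to be symmetric p.d., then the distribution
of $c[(X_1X_1')^{-1}(X_1X_2')(X_2X_2')^{-1}(X_2X_1')]$ (to be called $c$'s) could not involve as parameters
anything except $c(\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}')$ (to be called $\gamma$'s).

Proof: Notice that, a.e., the $p$ $c$'s are positive and distinct and also that the
$\gamma$'s are all non-negative. Use (A.3.16) to set $\Sigma_{11}(p \times p) = \mu_1(p \times p)\,\mu_1'(p \times p)$,
$\Sigma_{22}(q \times q) = \mu_2(q \times q)\,\mu_2'(q \times q)$ and $\Sigma_{12}(p \times q) = \mu_1(p \times p)\,[\,D_{\sqrt{\gamma}} \ \ 0\,](p \times q)\,\mu_2'(q \times q)$,
where $\mu_1$ and $\mu_2$ are non-singular. We have now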


zi 
Xa Xj] fide 0 Tp) ID» 0] Hn 0 
(A.7.3.1) Xi— = ' [Dus 
Lie Dee 0 дет! L Ü Iq) 0 uz! 


Also 
Ip) [Dys-0] Др) 0 


(A.7.3.2) D s = D, D у=» 0 
Iq) 
0 >; 0 0 I(q—p). - 


1р) [Diz 0] 


x ee 0 ] 
0 
0 l(g—p-- 
where we notice that, on the right hand side, one matrix factor is the transpose of
the other matrix factor. Taking the inverse on both sides of (A.7.3.2) we have


1 


й 3 СОО ee Oa] fp LC) A MT ET 
(A.7.3.3) ША a [ Dima 0 ] 
109) 0 
0 a 0  lI(g—p). 
I(p) 0 


х bid Br 0 | = M(y)M'(y) (say). 
0 UP Gea lcs 


Taking into account (A.7.3.1), (A.7.3.2) and (A.7.3.3) and using (A.1.5) we have 


x u o X 
(A.7.3.4) tr x | Й [X| X] -— My) | А ] | d 
X; 0A - CX, 
Ў Hest Ay) 
хх, Xo] Л My) 
О fs 


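Now put $\mu_1^{-1}X_1 = Y_1$ and $\mu_2^{-1}X_2 = Y_2$, i.e., $X_1(p \times n) = \mu_1(p \times p)\, Y_1(p \times n)$ and
$X_2(q \times n) = \mu_2(q \times q)\, Y_2(q \times n)$, and observe that, by (A.4.4),
$c[(X_1X_1')^{-1}(X_1X_2')(X_2X_2')^{-1}(X_2X_1')] = c[(Y_1Y_1')^{-1}(Y_1Y_2')(Y_2Y_2')^{-1}(Y_2Y_1')]$. Also, by (A.5.2),
$J(X_1, X_2 : Y_1, Y_2) = |\mu_1|^n |\mu_2|^n$.

Next check that
$$|\Sigma|^{n/2} = |\mu_1|^n |\mu_2|^n \begin{vmatrix} I(p) & [\,D_{\sqrt{\gamma}} \ \ 0\,] \\ \begin{bmatrix} D_{\sqrt{\gamma}} \\ 0 \end{bmatrix} & I(q) \end{vmatrix}^{n/2} = |\mu_1|^n |\mu_2|^n \prod_{i=1}^{p} (1 - \gamma_i)^{n/2},$$
and finally check that $(Y_1, Y_2)$ have the probability law:

(A.7.3.5)
$$\left[1 \Big/ (2\pi)^{(p+q)n/2} \prod_{i=1}^{p} (1 - \gamma_i)^{n/2}\right] \exp\left\{-\tfrac{1}{2} \operatorname{tr} M'(\gamma) \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} [\,Y_1' \ \ Y_2'\,]\, M(\gamma)\right\} dY_1\, dY_2.$$

In view of the fact that $c[(X_1X_1')^{-1}(X_1X_2')(X_2X_2')^{-1}(X_2X_1')] = c[(Y_1Y_1')^{-1}(Y_1Y_2')(Y_2Y_2')^{-1}(Y_2Y_1')]$,
the probability law (A.7.3.5) proves (A.7.3). Thus, as in the two previous
sections, if we are interested in the distribution of these roots, i.e., the $c$'s, we can,
without any loss of generality, start from the probability law (A.7.3.5), which is thus
a canonical form for this purpose.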


X, p у 
(А.7.4): If X=} X, | q(p<¢.p+atr < n) has the probability law (4.15): 
X, т 
n 
Eg X. 2з 7 P 
mam n 
| uen г ЕЧ өхр[—1 Z3XX' MX, where 5 = | Xi Za Xs | 7 
Xy X, Xs =" 
p q 


is supposed to be symmetric p.d., then the distribution of the c's could not involve as 
parameters anything except y's where cs and y's are respectively the characteristic 
roots of [X,X,—X,X,(X,X5)" "ХХ [X3X2 ОХ) X3 X5 ][A 2X9 — X,X; 
X(X3X5) X, Xe) "Xs Xi —X, X (XX) ре Папа [24 —2 45233 Xs = У-У Zas] 
X [222 — 5 os) 111—558 Уз]. 
Proof: Notice as in previous section that, a.e., the c's are positive and dis- 
tinct and that the y’s are all non-negative. Use (4.3.20) to set 


MT [D : 0] | КУ 


“Уң Lp Уз “Hi Оз RU 
ENTA Dj; | 
(А.7.4.1) Dye Xa Xs | =| 0 миа I 

———|— 0 | 

Xia Des Des 0 0} ps te 
> 0 | L2 

WM; 0]-0 
x 0 Ms 0 Р 


Hs Ma %- 


and check that 


fy 0s 4 Vy 9| Уз 
(А.7.4.2) O д | is of the form 0 vo! v 
0 |% 0 1$; 


» 
Proceeding as in the previous section we have now


х 
Ta Die 
M 010—1 Dom 0 
: 2 К д 0 | 
А.7.4.3 роо 0 E > 
( ) Ap [сү ral 
ae | 
Ms | Б » Ir É 
j 0 zl 
| т 0) fg у 
D 1—7 [D iiy 9 | 
xs d on 0 O sap | =w M(y)M'(y)u (say), 
бу Ка б ru 
| ШАУ, 
0 1 
and thus, as in the previous section, 
-— X; 
(A.7.4.4) Pe ee. келч 
' E. 
-X, 


= w[X; X, XjuM(y)M'()u X, 
Ха = 
22$ Ja 0 qs Y, 
Nowset | X, = 0 fy Ma Y, | and note that the c's are invariant 


X; DEO Uc Ey- 


n n 
under this transformation. Also ЈУ) =| |" and|X|? = ||" ft (1—у,}?. Thus, 
i-i 
finally Y has the probability" law. 


i 


Morath o n 
(А.7.4.5) [iem 7 Di auf | 


X exp| 250) Y, IY; Y; ҮМ) aYargr, 
Y, 5 


which proves (A.7.4), which we take to be a canonical form.

(A.7.5): If $X_1(p \times n_1)$ and $X_2(p \times n_2)$ ($p \le n_2$, but $p$ might be $\le$ or $> n_1$) have
the joint probability law (4.21): $\left[1 \big/ (2\pi)^{p(n_1+n_2)/2} |\Sigma|^{(n_1+n_2)/2}\right] \exp\left[-\tfrac{1}{2} \operatorname{tr} \Sigma^{-1}\{X_2X_2'\right.$
$\left. +\, (X_1 - \xi)(X_1' - \xi')\}\right] dX_1\, dX_2$, where $\Sigma(p \times p)$ is symmetric p.d. and $\xi$ is $p \times n_1$, then the
distribution of $c(X_1X_1'(X_2X_2')^{-1})$, to be called $c$'s, could not involve as parameters anything
except $c(\xi\xi'\Sigma^{-1})$, to be called $\gamma$'s.

Proof: Notice that, a.e., out of the $p$ $c$'s, $r$ are positive and $p - r$ are zero,
where $r = \min(p, n_1)$, and also that the $\gamma$'s are all non-negative and, out of them, $s$ are
positive and $p - s$ are zero, where $s \le \min(p, n_1)$ is the rank of $\xi$, i.e., of $\xi(p \times n_1)$.

: Assuming, as we can without any loss of generality, that the last s rows of £, 
i.e., the last s rows of £2’ can be taken as the basis, use (A.3.13) to set 


t p—8| i a XE фу 
(А.7.5.1) (6) хр) = Ру (sxs)[u дә] and 
8 L fly p—s s 
8 
m ds] [ui ues 
X(pxp)-— ЯРАТ т 
Ша Hid Hs #4- p—s 
p—s s 
та Its р 
where и = and ўз are non-singular апа D* stands for the diagonal matrix 
tHe Ш 
with s (non-zero and here positive) roots. If we now set 
#108 9—58 [m $ 
(A.7.5.2) E(p Xn) = = D's(sx 8)v(8 X n), 
čo e САУА 
т 8 


1 


cU 
it is easy to check that v is determined by v = D'ugi & and that уу = I(s). Let 


Dy. 1078 
Dy(pxp) = E 
0 0J p—s 
s p—s * 
Recall that s < min(p,7,). Recalling now that tr X,£' = tr £'X, and using (A.1.4), 
(A.7.5.1) and (A.7.5.2) we have 
(A.7.5.3) tr Z(X,X;--(X, — ENX — E) = tr HX X, +X, X — 2X" 
х D* al в]. 
Now using (A.1.7) complete У (т, x s) into an | (л, x n4) and rewrite the right 
D* уя ‘| s 
(OS 0 


‚ / s s—p 
xw(pxp)-uD,u'n-. Put now uX, = Y, and pX, 8 = Yj, ie, X,(pxn,) 


hand side of (A.7.5.3) as tr p (X,X;4- X,X, —2X,(p x m) O(n, Хт) х | 
n—8 


E 
$= \mu(p \times p)\, Y_1(p \times n_1)\, \theta(n_1 \times n_1)$ and $X_2(p \times n_2) = \mu(p \times p)\, Y_2(p \times n_2)$, and observe that,
by (A.4.3), $c(X_1X_1'(X_2X_2')^{-1}) = c(Y_1Y_1'(Y_2Y_2')^{-1})$ since $\mu$ is non-singular and $\theta$ is orthogonal.

Also, by (A.5.2), $J(X_1, X_2 : Y_1, Y_2) = |\mu|^{n_1+n_2}$. Also observe that $|\Sigma| = |\mu|^2$.

Finally check that $(Y_1, Y_2)$ has the probability law


B 


pn, тз) 3 
(A.7.5.4) 5 2 ) ехр [rye Y, Ү;—2Ү(рхт) 


D* 3:0 ] 
x } 4Ү, dY, 
0 0 


which, in view of the fact that (XX (XX) = (Y, ¥i(Y2¥5)"), completes the 
proof of (A.7.5). We also note as before that for the purpose of any discussion of the 
distribution of c's the probability law (A.7.5.4) can be taken as a canonical form. In 
(A.7.5.4) notice that 


D*» 078 B i П 
(А.7.5.5) tr У (рхт) [ ] = (Ууну tr D, = Xy. 
0 - 0-lm—s ex i=l 
8 p—s 


Using (A.7.5.5) the canonical form (A.7.5.4) can thus be reduced to the more 
convenient form 


(A.7.5.6) [1/27] ^? exp [a Y, vie Y, vae 2-20 Ууу аат, 


The reader must be cautioned against stretching any further the theorems (A.7.1),
(A.7.2), (A.7.3), (A.7.4) and (A.7.5). For example, taking (A.7.1), suppose that $0 < c_1$
$< c_2 < \ldots < c_p < \infty$ and $0 < \gamma_1 \le \ldots \le \gamma_p < \infty$. The joint distribution of the $c_i$'s could
not involve as parameters anything except the $\gamma_i$'s and, in fact, it does involve all
these parameters. But it must not be inferred from this (and it is not true either)
that the distribution of $c_i$ involves just $\gamma_i$, with $i = 1, 2, \ldots, p$. In fact, the distribution
of any $c_i$ will involve as parameters all the $\gamma_i$'s. Nor do the distributions of the
usual symmetric functions of the $c$'s involve, in general, the same functions of the
$\gamma$'s. The same thing is also true for (A.7.2) to (A.7.5).



APPENDIX 8 


Some Results in Integration 


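(A.8.1):
$$\int_{\sum_{i=1}^{n} (x_i/a_i)^{q_i} \le 1} \prod_{i=1}^{n} x_i^{\,p_i - 1}\, dx_i = \prod_{i=1}^{n} \left[\frac{a_i^{\,p_i}}{q_i}\, \Gamma\!\left(\frac{p_i}{q_i}\right)\right] \Big/ \Gamma\!\left(\sum_{i=1}^{n} \frac{p_i}{q_i} + 1\right),$$
where $x_i > 0$ and $a_i, p_i, q_i > 0$, $i = 1, 2, \ldots, n$. An important special case is where
$a_i = r$, $p_i = 1$ and $q_i = 2$, in which case we have

(A.8.2):
$$\int_{\sum_{i=1}^{n} x_i^2 \le r^2\ (x_i \ge 0)} \prod_{i=1}^{n} dx_i = \left[\Gamma(\tfrac{1}{2})\right]^n r^n \Big/ 2^n\, \Gamma\!\left(\frac{n}{2} + 1\right).$$

If, however, we integrate over the $x_i$'s in the domain $\sum_{i=1}^{n} x_i^2 \le r^2$, after dropping the restriction
that $x_i > 0$, i.e., if the $x_i$'s could take both positive and negative values, subject to $\sum_{i=1}^{n} x_i^2 \le r^2$,
then from considerations of symmetry we shall have

(A.8.3):
$$\int \prod_{i=1}^{n} dx_i = \left[\Gamma(\tfrac{1}{2})\right]^n r^n \Big/ \Gamma\!\left(\frac{n}{2} + 1\right).$$

Differentiating the above on both sides w.r.t. $r$ we have

(A.8.4):
$$\int_{r \le (\sum_{i=1}^{n} x_i^2)^{1/2} \le r + dr\ (r > 0)} \prod_{i=1}^{n} dx_i = n \left[\Gamma(\tfrac{1}{2})\right]^n r^{n-1}\, dr \Big/ \Gamma\!\left(\frac{n}{2} + 1\right).$$

(A.8.5):
$$\int \prod_{i=1}^{n} dx_i = (n-1) \left[\Gamma(\tfrac{1}{2})\right]^{n-1} r^{n-1}\, dr\, (\sin\theta)^{n-2}\, d\theta \Big/ \Gamma\!\left(\frac{n+1}{2}\right),$$
the integral being taken over the domain
$$r \le \Big(\sum_{i=1}^{n} x_i^2\Big)^{1/2} \le r + dr\ (r > 0), \qquad \theta \le \cos^{-1}\left[\sum_{i=1}^{n} a_i x_i \Big/ \Big(\sum_{i=1}^{n} a_i^2\Big)^{1/2} \Big(\sum_{i=1}^{n} x_i^2\Big)^{1/2}\right] \le \theta + d\theta.$$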
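Proof: Make a transformation $y_1 = \sum_{i=1}^{n} a_i x_i \big/ \big(\sum_{i=1}^{n} a_i^2\big)^{1/2} = r^* \cos\theta^*$, say, and
$y_i = \sum_{j=1}^{n} b_{ij} x_j$ ($i = 2, 3, \ldots, n$) such that
$$\begin{bmatrix} a_1/|a| & a_2/|a| & \cdots & a_n/|a| \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{bmatrix}, \qquad |a| = \Big(\sum_{i=1}^{n} a_i^2\Big)^{1/2},$$
is an orthogonal matrix. Then using (A.5.4) and remembering that $J(x : y) = 1/J(y : x)$,
we shall have $\prod_{i=1}^{n} dx_i \to \prod_{i=1}^{n} dy_i$, i.e., $dy_1 \prod_{i=2}^{n} dy_i$. We have furthermore $\sum_{i=1}^{n} y_i^2 = y_1^2$
$+ \sum_{i=2}^{n} y_i^2 = \sum_{i=1}^{n} x_i^2 = r^{*2}$, so that $\sum_{i=2}^{n} y_i^2 = r^{*2} - y_1^2 = r^{*2} \sin^2\theta^*$, whence $\big(\sum_{i=2}^{n} y_i^2\big)^{1/2}$
$= r^* \sin\theta^* = u^*$ (say). It is easy to see that the domain $r \le r^* \le r + dr$ and $\theta \le \theta^*$
$\le \theta + d\theta$ is exactly equivalent to $u \le u^* \le u + du$ and $v \le y_1 \le v + dv$, so that

(A.8.5.1)
$$\int_{\substack{r \le r^* \le r + dr \\ \theta \le \theta^* \le \theta + d\theta}} \prod_{i=1}^{n} dx_i = \int_{\substack{v \le y_1 \le v + dv \\ u \le u^* \le u + du}} \prod_{i=1}^{n} dy_i = dv \int_{u \le (\sum_{i=2}^{n} y_i^2)^{1/2} \le u + du} \prod_{i=2}^{n} dy_i$$
$$= (n-1) \left[\Gamma(\tfrac{1}{2})\right]^{n-1} u^{n-2}\, du\, dv \Big/ \Gamma\!\left(\frac{n-1}{2} + 1\right) \ \text{(using (A.8.4))} \ = (n-1) \left[\Gamma(\tfrac{1}{2})\right]^{n-1} r^{n-1}\, dr\, (\sin\theta)^{n-2}\, d\theta \Big/ \Gamma\!\left(\frac{n+1}{2}\right),$$

which proves (A.8.5). Notice that $y_1 = r^* \cos\theta^*$ and $u^* = r^* \sin\theta^*$, whence
$J(y_1, u^* : r^*, \theta^*) = r^*$, so that $dy_1\, du^* \to r^*\, dr^*\, d\theta^*$.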
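
The formulas (A.8.2) to (A.8.5) can be checked against each other numerically. The sketch below is an illustration with hypothetical function names: it evaluates the closed forms with $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$ and verifies, by Simpson's rule, that integrating the coefficient of $dr\, d\theta$ in (A.8.5) over $0 \le \theta \le \pi$ recovers the coefficient of $dr$ in (A.8.4).

```python
import math

def ball_volume(n, r):
    """(A.8.3): volume of the region sum x_i^2 <= r^2 in n dimensions."""
    return math.pi ** (n / 2) * r ** n / math.gamma(n / 2 + 1)

def orthant_volume(n, r):
    """(A.8.2): the same region restricted to x_i >= 0."""
    return ball_volume(n, r) / 2 ** n

def shell_density(n, r):
    """Coefficient of dr in (A.8.4)."""
    return n * math.pi ** (n / 2) * r ** (n - 1) / math.gamma(n / 2 + 1)

def shell_angle_density(n, r, theta):
    """Coefficient of dr d(theta) in (A.8.5)."""
    return ((n - 1) * math.pi ** ((n - 1) / 2) * r ** (n - 1)
            * math.sin(theta) ** (n - 2) / math.gamma((n + 1) / 2))

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with an even number m of sub-intervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for k in range(1, m):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def theta_integrated(n, r):
    """Integral of the (A.8.5) density over the full range 0 <= theta <= pi."""
    return simpson(lambda th: shell_angle_density(n, r, th), 0.0, math.pi)
```

For example, `ball_volume(2, 1.0)` is the area of the unit circle and `theta_integrated(3, 1.0)` is the surface area of the unit sphere.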


(A.8.6): The integral $\displaystyle\int_{LL' = I(p)} dL_I \Big/ \left|\frac{\partial(LL')}{\partial(L_D)}\right|_{L_I} = F(p, n)$ (say), where $L$ is $p \times n$
with $p \le n$. This can be evaluated directly but we shall use an artifice to derive
this. Consider the integral

(A.8.6.1)
$$\int \left[1 \big/ (2\pi)^{pn/2}\right] \exp\left[-\tfrac{1}{2} \operatorname{tr} YY'\right] dY,$$

where the elements of $Y(p \times n)$ ($p \le n$) vary from $-\infty$ to $\infty$. It is of course known
that this integral is equal to 1.


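Using now the transformation (A.3.11) we have $Y(p \times n) = \tilde{T}(p \times p)\, L(p \times n)$
under $LL' = I(p)$. Notice that almost everywhere $YY'$, and so $\tilde{T}$, will be non-singular.
The $t_{ii}$'s vary from 0 to $\infty$ and the $t_{ij}$'s ($i \ne j$) vary from $-\infty$ to $\infty$. Observe that
$YY' = \tilde{T}\tilde{T}'$ and $\operatorname{tr} YY' = \sum_{i \ge j = 1}^{p} t_{ij}^2$. Using now the result (A.6.1.9) we have

(A.8.6.2)
$$1 = \int \left[1 \big/ (2\pi)^{pn/2}\right] \exp\left[-\tfrac{1}{2} \operatorname{tr} YY'\right] dY = \left[2^p \big/ (2\pi)^{pn/2}\right] \int_{LL' = I(p)} dL_I \Big/ \left|\frac{\partial(LL')}{\partial(L_D)}\right|_{L_I}$$
$$\times \int_{\substack{0 \le t_{ii} < \infty \\ -\infty < t_{ij} < \infty\ (i > j)}} \exp\left[-\tfrac{1}{2} \sum_{i \ge j = 1}^{p} t_{ij}^2\right] \prod_{i=1}^{p} t_{ii}^{\,n-i} \prod_{i \ge j = 1}^{p} dt_{ij}.$$

But the last integral on the right hand side of (A.8.6.2) is easily evaluated to be
$2^{pn/2 - p}\, \pi^{p(p-1)/4} \prod_{i=1}^{p} \Gamma\!\left[\frac{n-i+1}{2}\right]$. Hence we have the following result (to be repeatedly
used):

(A.8.6.3)
$$F(p, n) = \int_{LL' = I(p)} dL_I \Big/ \left|\frac{\partial(LL')}{\partial(L_D)}\right|_{L_I} = \pi^{pn/2 - p(p-1)/4} \Big/ \prod_{i=1}^{p} \Gamma\!\left[\frac{n-i+1}{2}\right].$$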
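
As a sanity check on (A.8.6.3), the sketch below (hypothetical names, standard library only) evaluates $F(p, n)$, confirms that the three factors in (A.8.6.2) multiply to 1, compares $F(1, n)$ with half the surface area of the unit sphere in $n$ dimensions (the value the integral takes for $p = 1$, when $L_D$ is the single element $l_n$), and verifies numerically the one-dimensional integral $\int_0^\infty t^m e^{-t^2/2}\, dt = 2^{(m-1)/2}\, \Gamma\!\left(\frac{m+1}{2}\right)$ that underlies the evaluation of the $\tilde{T}$ integral.

```python
import math

def gamma_product(p, n):
    """prod_{i=1}^{p} Gamma((n - i + 1) / 2)."""
    return math.prod(math.gamma((n - i + 1) / 2) for i in range(1, p + 1))

def F(p, n):
    """(A.8.6.3)."""
    return math.pi ** (p * n / 2 - p * (p - 1) / 4) / gamma_product(p, n)

def t_integral(p, n):
    """Closed form of the last integral in (A.8.6.2)."""
    return 2 ** (p * n / 2 - p) * math.pi ** (p * (p - 1) / 4) * gamma_product(p, n)

def total_mass(p, n):
    """Right-hand side of (A.8.6.2); should equal 1."""
    return 2 ** p / (2 * math.pi) ** (p * n / 2) * F(p, n) * t_integral(p, n)

def half_sphere_area(n):
    """Half the surface area of the unit sphere in n dimensions."""
    return math.pi ** (n / 2) / math.gamma(n / 2)

def half_line_moment(m, upper=12.0, steps=4000):
    """Simpson estimate of the integral of t^m exp(-t^2/2) over (0, upper)."""
    h = upper / steps
    f = lambda t: t ** m * math.exp(-t * t / 2)
    s = f(0.0) + f(upper)
    for k in range(1, steps):
        s += (4 if k % 2 else 2) * f(k * h)
    return s * h / 3
```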


Another integral that is useful is 


(A.8.6.4) f а, 
LL’ =I(p) 


(Ly) 


EA | - Fiat em 
T 


where L is px n, p < n and where the first row of L is to be non-negative. To evaluate 
this we consider the integral 


(A.8.6.5)
$$\int_Y \frac{1}{(2\pi)^{pn/2}} \exp\left[-\tfrac{1}{2}\operatorname{tr} YY'\right] dY,$$

where $Y(p \times n)$ $(p \le n)$ is such that the elements of the first row vary from 0 to $\infty$ and the other elements vary from $-\infty$ to $\infty$. Now using the transformation (A.3.11) we have $Y(p \times n) = \tilde{T}(p \times p)\, L(p \times n)$, where $LL' = I(p)$ and the first row of $L$ is to be positive. Then proceeding exactly in the same manner as in the previous case we should have


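(A.8.6.6)
$$F_1(p, n) = 2^{-n}\, \pi^{\frac{pn}{2} - \frac{p(p-1)}{4}} \Big/ \prod_{i=1}^{p} \Gamma\left[\frac{n-i+1}{2}\right],$$

or in other words,

(A.8.6.7)
$$F_1(p, n) = 2^{-n} F(p, n),$$

and hence

(A.8.6.8)
$$F_1(p, p) = 2^{-p} F(p, p).$$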


Another integral that is useful is the one that arises out of (A.6.1.18), namely, 


Pratik, Daplis) | 
A.8.6.9 dL, TEIP nel, ““r+1p 10) _ À 
DEN [зеш eye a, 


Тр 


where the variables L,,, are subject to constraints among themselves and also
in relation to the constants L,, which are described by (A.6.1.15). The integral 
can be obtained by going back to (A.6.1.18) and equating the integral of the left side 


over (x; X)! < 1(i = r--1,..., p), and the right side over (X tz)! < 1, subject to 
j=l 


ti > 0(i =7+1,..., р) and over L,,,. By using (A.8.1) modified by using a factor 
of 2 to allow for each ty = 1,2,..., i—1 and i = r+1, ++, p) to be both positive 
and negative we observe that this leads to the equation 


nip ғ) (w-rXp+r-1) 


(A.8.6.10) m’ "(а 41) =r = терр (55$) та) 


X the integral (A.8.6.9), 
B 


whence it follows that 


(A.8.6.11) f Фу | |ы ы als) 


+1,р1 zr 
^ OL, | 
pu ( pb) І; 
n-r) (p-rXp-r-1) 
= PURI A efesien, 
ier 2 


It is worth a careful notice that this result is independent of the set of given constants $L_{1,r}$.
(A.8.7): The integral $\int_U \exp\left[-\tfrac{1}{2}\operatorname{tr} UU'\right] |U|\, dU$, where $U(p \times p)$ has its elements varying from $-\infty$ to $\infty$. Using the transformation (A.3.11) we have $U(p \times p) = \tilde{T}(p \times p)\, L(p \times p)$ with an orthogonal $L$. Notice that $|U| = |\tilde{T}| = \prod_{i=1}^{p} t_{ii}$ and that almost everywhere $\tilde{T}$ is non-singular. Also, as before, $\operatorname{tr} UU' = \sum_{i \ge j = 1}^{p} t_{ij}^2$. Hence we have, by using (A.6.1.9) and (A.8.6.3), [31, 32],
„(А.вл.1) | eot vou = з” [ou | gU 
U Шир) Lp) іш 
5 р s 
x exp[-3 X jigri И dt, 
J PI ЕРТЕ ш Sie 
0 € fy — oo, 
=. < <% 
G 2 3) 


T ара) vp i г (set fi г [en |р 


i=1 ^ 


Since $U(p \times p)$ (with a positive first row) $= \tilde{T}(p \times p)\, L(p \times p)$ (where $LL' = I(p)$ and $L$ has a positive first row), therefore, from (A.8.6.8) and (A.8.7.1) we have


(A.8.7.2) | exp [—1 tr UU']| U| dU 
U(with a positive first row) 


E iud [2 й г fee) it г а =) 


p-n p 
(A.8.8): The integral | exp[—4 tr UULU]. СА av, 
U 


(RU 


U, U, 
n p—n 


p—n 
where U(px p) = | ] , where the first row of U, and the diagonal 


of С, vary from 0 to $\infty$ and the rest from $-\infty$ to $\infty$.


V, їз | p—n Cy U. 4M 
To evaluate this integral, let V = zs 
2 


Vid n U; U, M 
n p—n 
= SERO 
E D | Ue 7] | ] ‚ where M is | and with a positive first row. 
n LU, U; M p—n 


крл IE E Ai A 
Then we have $VV' = UU'$ and $|V| = |U|$ and
IV : U) = J(Vs, V, : Us, My, U,) = ДР, : Oy, My) ДУ, : Us) 


=з = MM’) =A es 
= ә" n—i| |” 2r Ir eu n—i 
ars Mi. ae | 
Thus we have 
(A.8.8.1) Í exp[—} tr VV']| V|sdV 


Lá 


| (MLM) | 
| 00) и, 


=? "| ера vvv] (бур! 40 xf ам, | |20020) 


U 


Now substituting from (A.8.7.2) in the left hand side of (A.8.8.1) and from (A.8.6.3) and (A.8.6.8) in the right hand side of (A.8.8.1) we have, [31],


(A.8.8.2) | ех—4 0070 е7 Dg au 
U i=1 
_ (p—n)(p—n--1) д L 
= ЖКА, 4 сы: 1 p—n—i+1). ti p(p—itl 
a т =) ir ( 2 *) =f | 2 |. 


eS 


APPENDIX 9 


Some Results in Integration Connected with the Distribution of 
the Largest and the Smallest Roots 


(A.9.1): Evaluation of the integral


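$$\int_{x_s=0}^{x} \int_{x_{s-1}=0}^{x_s} \cdots \int_{x_1=0}^{x_2}
\begin{vmatrix}
x_s^{m_s}(1-x_s)^{n_s} & x_s^{m_{s-1}}(1-x_s)^{n_{s-1}} & \cdots & x_s^{m_1}(1-x_s)^{n_1} \\
x_{s-1}^{m_s}(1-x_{s-1})^{n_s} & x_{s-1}^{m_{s-1}}(1-x_{s-1})^{n_{s-1}} & \cdots & x_{s-1}^{m_1}(1-x_{s-1})^{n_1} \\
\vdots & \vdots & & \vdots \\
x_1^{m_s}(1-x_1)^{n_s} & x_1^{m_{s-1}}(1-x_1)^{n_{s-1}} & \cdots & x_1^{m_1}(1-x_1)^{n_1}
\end{vmatrix}
\prod_{i=1}^{s} dx_i$$

$$= \beta[x; m_s, n_s; m_{s-1}, n_{s-1}; \ldots; m_1, n_1] \ \text{(say)}
= \beta\left[x;\ \begin{matrix}
m_s, n_s & m_{s-1}, n_{s-1} & \cdots & m_1, n_1 \\
m_s, n_s & m_{s-1}, n_{s-1} & \cdots & m_1, n_1 \\
\vdots & \vdots & & \vdots \\
m_s, n_s & m_{s-1}, n_{s-1} & \cdots & m_1, n_1
\end{matrix}\right] \ \text{(say)}.$$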

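The last expression is in the form of a pseudo-determinant whose meaning is made clear by considering, for illustration, the case of $s = 3$, for which

(A.9.1.1)
$$\begin{aligned}
\beta&\left[x;\ \begin{matrix}
m_3, n_3 & m_2, n_2 & m_1, n_1 \\
m_3, n_3 & m_2, n_2 & m_1, n_1 \\
m_3, n_3 & m_2, n_2 & m_1, n_1
\end{matrix}\right] \\
&= \int_0^x x_3^{m_3}(1-x_3)^{n_3} dx_3 \left[ \int_0^{x_3} x_2^{m_2}(1-x_2)^{n_2} dx_2 \int_0^{x_2} x_1^{m_1}(1-x_1)^{n_1} dx_1 - \int_0^{x_3} x_2^{m_1}(1-x_2)^{n_1} dx_2 \int_0^{x_2} x_1^{m_2}(1-x_1)^{n_2} dx_1 \right] \\
&\quad - \int_0^x x_3^{m_2}(1-x_3)^{n_2} dx_3 \left[ \int_0^{x_3} x_2^{m_3}(1-x_2)^{n_3} dx_2 \int_0^{x_2} x_1^{m_1}(1-x_1)^{n_1} dx_1 - \int_0^{x_3} x_2^{m_1}(1-x_2)^{n_1} dx_2 \int_0^{x_2} x_1^{m_3}(1-x_1)^{n_3} dx_1 \right] \\
&\quad + \int_0^x x_3^{m_1}(1-x_3)^{n_1} dx_3 \left[ \int_0^{x_3} x_2^{m_3}(1-x_2)^{n_3} dx_2 \int_0^{x_2} x_1^{m_2}(1-x_1)^{n_2} dx_1 - \int_0^{x_3} x_2^{m_2}(1-x_2)^{n_2} dx_2 \int_0^{x_2} x_1^{m_3}(1-x_1)^{n_3} dx_1 \right].
\end{aligned}$$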
In opening out the pseudo-determinant it is very important to stick to the order of the factors, indicated in the expansion on the right side of (A.9.1.1) and to keep in mind that the factors are non-commutative. It is also clear that the whole expression will be zero if any two columns become equal in

$$\begin{matrix}
m_s, n_s & m_{s-1}, n_{s-1} & \cdots & m_1, n_1 \\
m_s, n_s & m_{s-1}, n_{s-1} & \cdots & m_1, n_1 \\
\vdots & \vdots & & \vdots \\
m_s, n_s & m_{s-1}, n_{s-1} & \cdots & m_1, n_1
\end{matrix}$$

Next we use the notation

(A.9.1.2)
$$\beta(x; m_s, n_s; m_{s-1}, n_{s-1}; \ldots; m_1, n_1)
= \int_0^x x_s^{m_s}(1-x_s)^{n_s} dx_s \int_0^{x_s} x_{s-1}^{m_{s-1}}(1-x_{s-1})^{n_{s-1}} dx_{s-1} \cdots \int_0^{x_3} x_2^{m_2}(1-x_2)^{n_2} dx_2 \int_0^{x_2} x_1^{m_1}(1-x_1)^{n_1} dx_1,$$

so that $\beta(x; m, n) = \int_0^x x_1^{m}(1-x_1)^{n} dx_1$ = the incomplete $\beta$-function, using a slightly different notation from the usual one. Also let $x^m(1-x)^n = \beta_0(x; m, n)$. In terms of (A.9.1.2), the expression (A.9.1.1) can be rewritten as
(A.9.1.3)
$$\begin{aligned}
&\beta(x; m_3, n_3; m_2, n_2; m_1, n_1) - \beta(x; m_3, n_3; m_1, n_1; m_2, n_2) \\
&- \beta(x; m_2, n_2; m_3, n_3; m_1, n_1) + \beta(x; m_2, n_2; m_1, n_1; m_3, n_3) \\
&+ \beta(x; m_1, n_1; m_3, n_3; m_2, n_2) - \beta(x; m_1, n_1; m_2, n_2; m_3, n_3),
\end{aligned}$$

and (A.9.1) can be rewritten as

(A.9.1.4)
$$\sum \pm\, \beta(x; m_s', n_s'; m_{s-1}', n_{s-1}'; \ldots; m_1', n_1'),$$

where $(m_s', n_s'), (m_{s-1}', n_{s-1}'), \ldots, (m_1', n_1')$ is any permutation of $(m_s, n_s), (m_{s-1}, n_{s-1}), \ldots, (m_1, n_1)$, the summation is taken over all such permutations, the positive or negative sign is taken exactly as in the usual expansion of a determinant, care being taken to preserve the order of factorization from $x_s$ through $x_{s-1}$, $x_{s-2}$, down to $x_1$.
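The signed expansion of the pseudo-determinant can be sketched computationally. The code below is an illustration under stated assumptions: it restricts all exponents to non-negative integers so that every nested integral of the type (A.9.1.2) is a polynomial and can be evaluated exactly with rational arithmetic, and it takes the arrangement $(m_s, n_s), \ldots, (m_1, n_1)$ as the positive term. The names `weight`, `multiply`, `nested_beta`, `permutation_sign` and `pseudo_determinant` are hypothetical and are introduced only for this example.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

def weight(m, n):
    """Coefficients, lowest degree first, of t**m * (1 - t)**n for integers m, n >= 0."""
    return [Fraction(0)] * m + [Fraction((-1) ** k * comb(n, k)) for k in range(n + 1)]

def multiply(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def nested_beta(x, pairs):
    """Nested integral of type (A.9.1.2); pairs[0] belongs to the outermost integral."""
    inner = [Fraction(1)]
    for m, n in reversed(pairs):
        integrand = multiply(weight(m, n), inner)
        # Antiderivative vanishing at 0, so the lower limit of each integral is 0.
        inner = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]
    return sum(c * x ** k for k, c in enumerate(inner))

def permutation_sign(order):
    """+1 for an even permutation of 0..s-1 and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(order))
        for j in range(i + 1, len(order))
        if order[i] > order[j]
    )
    return -1 if inversions % 2 else 1

def pseudo_determinant(x, pairs):
    """Signed sum over all arrangements of the exponent pairs, as in (A.9.1.4)."""
    return sum(
        permutation_sign(order) * nested_beta(x, [pairs[i] for i in order])
        for order in permutations(range(len(pairs)))
    )

x = Fraction(3, 5)
vanishing = pseudo_determinant(x, [(2, 1), (2, 1), (0, 3)])   # two equal columns
```

With two equal exponent pairs the signed terms cancel in pairs, which is the vanishing property noted above; interchanging two distinct pairs changes the sign of the whole expression.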


(A.9.2): Lemma:

$$\int_0^{x_0} x^m (1-x)^n f(x)\, dx
= \frac{1}{m+n+1} \left[ -x_0^m (1-x_0)^{n+1} f(x_0) + \int_0^{x_0} x^m (1-x)^{n+1} f'(x)\, dx + m \int_0^{x_0} x^{m-1} (1-x)^n f(x)\, dx \right],$$

where $m, n > -1$, $x_0 < 1$, and $f(x)$ is such that $f'(x)$ and the three integrals on two sides of (A.9.2) exist.

Proof: The proof can be carried out by integration by parts, the integration being with respect to the function $(1-x)^{m+n}$ and the differentiation being with respect to the function $f(x)\, x^m (1-x)^{-m}$.
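The lemma can be checked numerically on a particular case. The sketch below assumes that (A.9.2) reads $\int_0^{x_0} x^m(1-x)^n f\,dx = \frac{1}{m+n+1}\left[-x_0^m(1-x_0)^{n+1} f(x_0) + \int_0^{x_0} x^m(1-x)^{n+1} f'\,dx + m\int_0^{x_0} x^{m-1}(1-x)^n f\,dx\right]$, and it uses the arbitrary choices $m = 2$, $n = 1.5$, $f(x) = \sin x$ and $x_0 = 0.6$; the helper names `simpson` and `lemma_sides` are hypothetical. Taking $m \ge 1$ keeps the integrated part at the lower limit equal to zero.

```python
import math

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def lemma_sides(m, n, f, f_prime, x0):
    """Return the left and right hand sides of the assumed form of (A.9.2)."""
    lhs = simpson(lambda x: x ** m * (1 - x) ** n * f(x), 0.0, x0)
    boundary = -(x0 ** m) * (1 - x0) ** (n + 1) * f(x0)
    derivative_term = simpson(lambda x: x ** m * (1 - x) ** (n + 1) * f_prime(x), 0.0, x0)
    lower_power_term = m * simpson(lambda x: x ** (m - 1) * (1 - x) ** n * f(x), 0.0, x0)
    rhs = (boundary + derivative_term + lower_power_term) / (m + n + 1)
    return lhs, rhs

lhs, rhs = lemma_sides(2, 1.5, math.sin, math.cos, 0.6)
```

The two returned values should agree up to quadrature error.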


(A.9.3) Lemma:

$$\sum \beta(x; m_s', n_s'; m_{s-1}', n_{s-1}'; \ldots; m_1', n_1') = \prod_{i=1}^{s} \beta(x; m_i, n_i),$$

where on the left hand side $(m_s', n_s'), \ldots, (m_1', n_1')$ is any permutation of $(m_s, n_s), \ldots, (m_1, n_1)$, the summation is taken over all such permutations and where the factors on the right hand side have been already defined.


Proof: The nature of the proof will be evident by considering, for simplicity of algebra, the case of $s = 2$. We have

(A.9.3.1)
$$\begin{aligned}
&\int_0^x x_2^{m_2}(1-x_2)^{n_2} dx_2 \int_0^{x_2} x_1^{m_1}(1-x_1)^{n_1} dx_1 + \int_0^x x_2^{m_1}(1-x_2)^{n_1} dx_2 \int_0^{x_2} x_1^{m_2}(1-x_1)^{n_2} dx_1 \\
&= \int_0^x x_2^{m_2}(1-x_2)^{n_2} dx_2 \int_0^{x_2} x_1^{m_1}(1-x_1)^{n_1} dx_1 + \int_0^x x_2^{m_2}(1-x_2)^{n_2} dx_2 \int_{x_2}^{x} x_1^{m_1}(1-x_1)^{n_1} dx_1
\end{aligned}$$

(which is obtained by interchanging, in the second term on the left side of (A.9.3.1), the variables $x_1$ and $x_2$ and rewriting the domain of integration in the appropriate manner)

$$= \int_0^x x_2^{m_2}(1-x_2)^{n_2} dx_2 \int_0^{x} x_1^{m_1}(1-x_1)^{n_1} dx_1 = \beta(x; m_2, n_2)\, \beta(x; m_1, n_1).$$
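The same identity can be verified exactly for $s = 3$ in a particular case. The sketch assumes non-negative integer exponents, so that the nested integrals of type (A.9.1.2) are polynomials in $x$ and rational arithmetic applies; the helper names and the particular exponent pairs are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

def weight(m, n):
    """Coefficients, lowest degree first, of t**m * (1 - t)**n for integers m, n >= 0."""
    return [Fraction(0)] * m + [Fraction((-1) ** k * comb(n, k)) for k in range(n + 1)]

def multiply(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def nested_beta(x, pairs):
    """Nested integral of type (A.9.1.2); pairs[0] belongs to the outermost integral."""
    inner = [Fraction(1)]
    for m, n in reversed(pairs):
        integrand = multiply(weight(m, n), inner)
        inner = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]
    return sum(c * x ** k for k, c in enumerate(inner))

x = Fraction(3, 5)
pairs = [(2, 1), (0, 3), (1, 2)]

# Left side of the lemma: unsigned sum over all 3! arrangements of the pairs.
lhs = sum(nested_beta(x, list(order)) for order in permutations(pairs))

# Right side of the lemma: product of the single incomplete beta integrals.
rhs = Fraction(1)
for pair in pairs:
    rhs *= nested_beta(x, [pair])
```

Here `lhs` and `rhs` should be equal as exact fractions, reflecting the fact that the $s!$ ordered regions together make up the whole cube $0 \le x_i \le x$.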


(A.9.4): Lemma:

$$\sum_{r=1}^{s} \beta_r(x; m_{s-1}, n_{s-1}; \ldots; m_r, n_r; m, n; m_{r-1}, n_{r-1}; \ldots; m_1, n_1) = \beta(x; m, n)\, \beta(x; m_{s-1}, n_{s-1}; \ldots; m_1, n_1),$$

where $\beta_r$ is the result of putting $(m, n)$ in the $r$-th place and filling up the other positions with $(m_{s-1}, n_{s-1}), (m_{s-2}, n_{s-2}), \ldots, (m_1, n_1)$, $r$ running from 1 to $s$. Notice that each $\beta_r$ is an $s$-fold integral, while $\beta(x; m_{s-1}, n_{s-1}; \ldots; m_1, n_1)$ is an $(s-1)$-fold integral.
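Proof: The mechanism of the proof is brought out by considering, in particular, the case $s = 3$, where we have

(A.9.4.1)
$$\begin{aligned}
&\beta_1(x; m_2, n_2; m_1, n_1; m, n) + \beta_2(x; m_2, n_2; m, n; m_1, n_1) + \beta_3(x; m, n; m_2, n_2; m_1, n_1) \\
&= \int_0^x x_3^{m_2}(1-x_3)^{n_2} dx_3 \int_0^{x_3} x_2^{m_1}(1-x_2)^{n_1} dx_2 \int_0^{x_2} x_1^{m}(1-x_1)^{n} dx_1 \\
&\quad + \int_0^x x_3^{m_2}(1-x_3)^{n_2} dx_3 \int_0^{x_3} x_2^{m_1}(1-x_2)^{n_1} dx_2 \int_{x_2}^{x_3} x_1^{m}(1-x_1)^{n} dx_1 \\
&\quad + \int_0^x x_3^{m_2}(1-x_3)^{n_2} dx_3 \int_0^{x_3} x_2^{m_1}(1-x_2)^{n_1} dx_2 \int_{x_3}^{x} x_1^{m}(1-x_1)^{n} dx_1
\end{aligned}$$

(by interchanging the variables and suitably adjusting the domain of integration)

$$= \beta(x; m, n)\, \beta(x; m_2, n_2; m_1, n_1).$$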
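The case $s = 3$ of this lemma can likewise be confirmed exactly on an example. The sketch below again assumes non-negative integer exponents so that rational arithmetic applies; the pair `extra` plays the part of $(m, n)$ and `base` that of $(m_2, n_2), (m_1, n_1)$, and all names and numerical choices are hypothetical.

```python
from fractions import Fraction
from math import comb

def weight(m, n):
    """Coefficients, lowest degree first, of t**m * (1 - t)**n for integers m, n >= 0."""
    return [Fraction(0)] * m + [Fraction((-1) ** k * comb(n, k)) for k in range(n + 1)]

def multiply(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def nested_beta(x, pairs):
    """Nested integral of type (A.9.1.2); pairs[0] belongs to the outermost integral."""
    inner = [Fraction(1)]
    for m, n in reversed(pairs):
        integrand = multiply(weight(m, n), inner)
        inner = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]
    return sum(c * x ** k for k, c in enumerate(inner))

x = Fraction(2, 3)
base = [(2, 1), (1, 2)]    # stands for (m_2, n_2), (m_1, n_1), kept in this order
extra = (3, 0)             # stands for (m, n)

# Put the extra pair in each of the three possible places in turn and add.
lhs = sum(
    nested_beta(x, base[:k] + [extra] + base[k:])
    for k in range(len(base) + 1)
)
rhs = nested_beta(x, [extra]) * nested_beta(x, base)
```

The equality of `lhs` and `rhs` corresponds to splitting the range of the extra variable into the three intervals shown in (A.9.4.1).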


(A.9.5): Lemma: 


Mas Ng A Та 
r- y toe nv 
2:(—4) Be | =; жу... Mi Mi 
ы Mis ng .. My, NI 
Mgs №... My, Ny 
= 7—1 A + ГА T 
7 xc-1) P(E; Th pass т) Bre | s S MT 5 У 
т, e. My, Ny 


where f,[";...] on the left side is the result of replacing the 1? row of pla; ...] by 
(mj, mj), ..., (mj, mj) and f, o; ...] on the right side is the result of suppressing the т 
row and r column of Bix; ...]. Notice that p, [m ; ... Jisan sxs and Py [v ; ...] an 
(s—1)x(s—1) pseudo--determinant. 
Proof: The mechanism of the proof will be made clear by considering for simplicity the case of $s = 3$ and picking out from the expansion of each pseudo-determinant on the left side of (A.9.5) (for the case $s = 3$) the term involving the index, say, $(m_3', n_3')$ and putting together all such terms (with index $m_3', n_3'$). We have thus the following contribution from such terms
(A.9.5.1) B(x; Mg, nis; Mg, Nasi My, т) — P(E; Mg, Ng; My, 105; My, Ny) 

P(e; т»; Ms hs; My, 1) — (v; my, 35 Ms, ngi My, Tt) (0; 7,75; тп; Ms, Ny) 
—#(@; my, My; My, ng; тл) = fs та, Ng)[ P(x; Mg, Na; m3, ту) — (95 my, ni; т». ng)] 


Mo, Ng My, Ny 
(AT 
= (0; mg:n) В | =; 
Ms, No My, Ту 


Mg, ng Mg, Ng My, Ny 


{using (A.9.4)) 


= f(x; Mg, ng) Bu Ti Ms, Ng Mo, No My, Ny 


Mg, Nz Mo, Ny My, Ny 


(using the notation introduced in the beginning of lemma (A.9.5)). This immediately shows that if, in the general case, from the expansion of each pseudo-determinant (with the proper sign) on the left side of (A.9.5) we pick out the term with the index $(m_r', n_r')$ and add together such terms (with the same index $(m_r', n_r')$) we shall have the following contribution


Tuy ss My, Ny 
(A.9.5.2) Ble; mp m) Puj v; 


m, 


gs Ng e My, Ny 


whence the proof becomes obvious by combining different expressions like (A.9.5.2) involving the different indices $(m_r', n_r')$ $(r = 1, 2, \ldots, s)$.


(A.9.6): Reduction and evaluation of the integral

$$\beta\left[x;\ \begin{matrix}
m_s, n & m_{s-1}, n & \cdots & m_1, n \\
\vdots & \vdots & & \vdots \\
m_s, n & m_{s-1}, n & \cdots & m_1, n
\end{matrix}\right],$$

where $m_s > m_{s-1} > \cdots > m_1 > -1$ and $n > -1$ and the $m$'s differ by integers.

We have already seen from (A.9.1.4) that the pseudo-determinant can be expanded into $\sum \pm\, \beta(x; m_s', n; \ldots; m_1', n)$ where $(m_s', \ldots, m_1')$ is any permutation of $(m_s, \ldots, m_1)$, the summation is over all such permutations, $s!$ in number, and the positive or negative sign is to be taken according as it is an even or an odd permutation. Recalling from (A.9.1) that $\beta$ will be zero if any two columns of the pseudo-determinant are equal, let us try to reduce $m_s$ to $m_{s-1}$ by successive integration by parts. Toward this end consider the typical term in the expansion and in that term let $m_r'$ be the largest exponent $= m_s$ (of course). To reduce this exponent by 1 we proceed as follows. By definition


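(A.9.6.1)
$$\begin{aligned}
\beta(x; m_s', n; \ldots; m_{r+1}', n; m_s, n; m_{r-1}', n; \ldots; m_1', n)
&= \int_0^x x_s^{m_s'}(1-x_s)^{n} dx_s \cdots \int_0^{x_{r+2}} x_{r+1}^{m_{r+1}'}(1-x_{r+1})^{n} dx_{r+1} \int_0^{x_{r+1}} x_r^{m_s}(1-x_r)^{n} dx_r \\
&\quad \times \int_0^{x_r} x_{r-1}^{m_{r-1}'}(1-x_{r-1})^{n} dx_{r-1} \cdots \int_0^{x_2} x_1^{m_1'}(1-x_1)^{n} dx_1.
\end{aligned}$$

Now using (A.9.2) we have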
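(A.9.6.2)
$$\begin{aligned}
&\int_0^{x_{r+1}} x_r^{m_s}(1-x_r)^{n} dx_r \int_0^{x_r} x_{r-1}^{m_{r-1}'}(1-x_{r-1})^{n} dx_{r-1} \cdots \int_0^{x_2} x_1^{m_1'}(1-x_1)^{n} dx_1 \\
&= \int_0^{x_{r+1}} x_r^{m_s}(1-x_r)^{n}\, \beta(x_r; m_{r-1}', n; \ldots; m_1', n)\, dx_r \\
&= \frac{1}{m_s+n+1} \Big[ -x_{r+1}^{m_s}(1-x_{r+1})^{n+1}\, \beta(x_{r+1}; m_{r-1}', n; \ldots; m_1', n) \\
&\qquad + \int_0^{x_{r+1}} x_r^{m_s}(1-x_r)^{n+1}\, \beta'(x_r; m_{r-1}', n; \ldots; m_1', n)\, dx_r
+ m_s \int_0^{x_{r+1}} x_r^{m_s-1}(1-x_r)^{n}\, \beta(x_r; m_{r-1}', n; \ldots; m_1', n)\, dx_r \Big] \\
&= \frac{1}{m_s+n+1} \Big[ -\beta_0(x_{r+1}; m_s, n+1)\, \beta(x_{r+1}; m_{r-1}', n; \ldots; m_1', n) \\
&\qquad + \beta(x_{r+1}; m_{r-1}'+m_s, 2n+1; m_{r-2}', n; \ldots; m_1', n)
+ m_s\, \beta(x_{r+1}; m_s-1, n; m_{r-1}', n; \ldots; m_1', n) \Big]
\end{aligned}$$

(notice that $\beta'(x_r; m_{r-1}', n; \ldots; m_1', n) = x_r^{m_{r-1}'}(1-x_r)^{n}\, \beta(x_r; m_{r-2}', n; \ldots; m_1', n)$ and also that on the right hand side of (A.9.6.2), the first and second $\beta$'s are each an $(r-1)$-fold integral while the third $\beta$ is an $r$-fold integral).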


Now substituting the right hand side of (A.9.6.2) we have (A.9.6.1) reducing to

(A.9.6.3)
$$\begin{aligned}
\frac{1}{m_s+n+1} \Big[ &-\beta(x; m_s', n; \ldots; m_{r+2}', n; m_{r+1}'+m_s, 2n+1; m_{r-1}', n; \ldots; m_1', n) \\
&+ \beta(x; m_s', n; \ldots; m_{r+1}', n; m_{r-1}'+m_s, 2n+1; m_{r-2}', n; \ldots; m_1', n) \\
&+ m_s\, \beta(x; m_s', n; \ldots; m_{r+1}', n; m_s-1, n; m_{r-1}', n; \ldots; m_1', n) \Big],
\end{aligned}$$

where the first and second $\beta$'s are each an $(s-1)$-fold integral while the third $\beta$ is an $s$-fold integral with the index $m_s$ reduced to $m_s-1$. It is easy to check through (A.9.6.1) to (A.9.6.3) that the reduction to (A.9.6.3) holds for $r = s-1, s-2, \ldots, 2$. If $r = s$, it is easy to see that (A.9.6.3) will be replaced by

(A.9.6.4)
$$\begin{aligned}
\frac{1}{m_s+n+1} \Big[ &-\beta_0(x; m_s, n+1)\, \beta(x; m_{s-1}', n; \ldots; m_1', n) \\
&+ \beta(x; m_{s-1}'+m_s, 2n+1; m_{s-2}', n; \ldots; m_1', n) \\
&+ m_s\, \beta(x; m_s-1, n; m_{s-1}', n; \ldots; m_1', n) \Big],
\end{aligned}$$

and if $r = 1$, (A.9.6.3) will be replaced by

(A.9.6.5)
$$\frac{1}{m_s+n+1} \Big[ -\beta(x; m_s', n; \ldots; m_3', n; m_2'+m_s, 2n+1) + m_s\, \beta(x; m_s', n; \ldots; m_2', n; m_s-1, n) \Big].$$
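The two boundary cases of the reduction can be tested exactly for $s = 2$. The sketch below assumes that (A.9.6.4) and (A.9.6.5) have the form obtained by applying lemma (A.9.2) to the appropriate inner integral, namely, for $s = 2$, $\beta(x; m_2, n; m_1, n) = \frac{1}{m_2+n+1}\left[-\beta_0(x; m_2, n+1)\beta(x; m_1, n) + \beta(x; m_1+m_2, 2n+1) + m_2\beta(x; m_2-1, n; m_1, n)\right]$ and $\beta(x; m_1, n; m_2, n) = \frac{1}{m_2+n+1}\left[-\beta(x; m_1+m_2, 2n+1) + m_2\beta(x; m_1, n; m_2-1, n)\right]$. Integer exponents with $m_2 \ge 1$ are assumed so that rational arithmetic applies; all function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def weight(m, n):
    """Coefficients, lowest degree first, of t**m * (1 - t)**n for integers m, n >= 0."""
    return [Fraction(0)] * m + [Fraction((-1) ** k * comb(n, k)) for k in range(n + 1)]

def multiply(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def nested_beta(x, pairs):
    """Nested integral of type (A.9.1.2); pairs[0] belongs to the outermost integral."""
    inner = [Fraction(1)]
    for m, n in reversed(pairs):
        integrand = multiply(weight(m, n), inner)
        inner = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]
    return sum(c * x ** k for k, c in enumerate(inner))

def beta0(x, m, n):
    """beta_0(x; m, n) = x**m * (1 - x)**n."""
    return x ** m * (1 - x) ** n

x, m2, m1, n = Fraction(2, 5), 3, 1, 2

# Case r = s = 2: the largest exponent m2 sits in the outermost integral.
lhs_outer = nested_beta(x, [(m2, n), (m1, n)])
rhs_outer = (
    -beta0(x, m2, n + 1) * nested_beta(x, [(m1, n)])
    + nested_beta(x, [(m1 + m2, 2 * n + 1)])
    + m2 * nested_beta(x, [(m2 - 1, n), (m1, n)])
) / (m2 + n + 1)

# Case r = 1: the largest exponent m2 sits in the innermost integral.
lhs_inner = nested_beta(x, [(m1, n), (m2, n)])
rhs_inner = (
    -nested_beta(x, [(m1 + m2, 2 * n + 1)])
    + m2 * nested_beta(x, [(m1, n), (m2 - 1, n)])
) / (m2 + n + 1)
```

In both cases the left and right members should coincide as exact fractions, the exponent $m_2$ having been lowered by one in the last term of each right member.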


We can now use the rather convenient notation 


р B B . t 
(A.9.6.6) Blac; mj, ®;...; mua dom, 21; Mp: %;...; My, m) 
‚к stan! E A omn y PR 
= (0; m;, т; ...5 тур mi My, NHL; mias %;...; у, n) 


where $(m_s, n+1)$ is supposed to be added to the $(m_{r+1}', n)$ on the left so as to reduce the integral by one dimension,
(A.9.6.7) Bolas my, т4-1) (ж; mi as ®;...; my т) 
= — м 
= Bla; m, n+l; my n; 5 mmn), and 


(A.9.6.8) ff; Mys n; ...} mz 14-та, 2-1; mz gs %;...; my; n) 


, x m^ 257 : wu 
= f(x; My, ®...; m, NHL; my 3, 0; Myos 05 ...; Mos n), 


where $(m_s, n+1)$ is supposed to be added to the $(m_{r-1}', n)$ on the right so as to reduce the integral by one dimension. Using now (A.9.6.1) to (A.9.6.8) we have


Tu № Т... My, Т 
» 
(A.9.6.9) BY 2; 
Me, ya. o Myn ) 
«< — 
M NHI Min ... Myn 
1 j Et qr © 
== Bi) c m n+l т,» ... т, т 
m,--n--l 
„Бп es | 
Ma NHI т: ... MNF - 
> — 
My, WL mu. 2 mn 
> — 
1 В 1 M,N+1 3,0. ... ma, n 
: ; 
ms4-n--1l 
LI Tas 0. Myy 
%—1,% жул,» .. mm 
m, 


am MIA pads | ў se i À 


m,—l,n m,4," .. M,N 


where, in the second pseudo-determinant, [ ] indicates that the corresponding terms in the formal expansion are not to be considered at all, [ ] being introduced merely to write the pseudo-determinant in a complete form. Recalling the notation (A.9.6.6) to (A.9.6.8) and the lemma (A.9.5) it is easy to see that


о 


€ << 

Mn FL M ja... myn 
(A.9.6.10) p 25 Я 

«€ — 

Tw,n-Fl т, 1, т My: № 


The vs ЛИ 


T x ЗНА 


* 
3 05-3, n


M+M; 2n-+1 


TS a. 


» where 2,2; 


row of 
M, yn 
the (s—1)-fold pseudo-determinant fj 
m, 3, 1 
my as 
Thus (A.9.6.10) = (ж; m, n4-1) B | =; 
m, 4. 
ms 3: Th 
i-i 
+ X (—1)7 Bla; mm, 2n4-1) P, | 2; 
Tl 
3s n 


where f,, is the (s—2)-fold integral obtained by suppressing the i^. row and r 


ту, To 


..] is an (s—1)-fold pseudo-determinant 


(m, +m, 2n4-1) .... (m,4-m,4, 2n--1) for (m, 1, n), (m, 9, т), ++», (m,,n) in the 


ma. 


mtm, 2n.3- 1. 5 


obtained by substituting 
yh 


My, 


ту, № 


ту. № 


t column 


of the (s—1)-fold pseudo-determinant 2 already referred to. We have likewise 


> —3 
Mp NnF m,4" 
(A.9.6.11) Poa OH Brei rt 
m, NHI Mg% 
А [PET 
[ TS n 


my, 


m,--m, 4, 2n4-1 


Ea (—1)3 Bla; m,4-m,.,, 221) By | 35 


т=1 


m a.m 




Now substituting from (A.9.6.10) and (A.9.6.11) in the right side of (A.9.6.9) we have


Ms, № Ta. № Ga My, № Ч 
(А.9.6.12) В| =; 
m,, N m, a. Na Ta. № 
б 
Ts ... mum 
a ш Blas nl) | а 
m,--n-4-1 
mayas e My, N 
My e. Ta 
à су (узд 2n-+1 
— —1yf-3 f(x; m, +m, 2n 2; 
m, 4-n4- 17 , (1 5—7; = ) Ё 2 
foa. s Myn 
m,—1,n maya. Ae ma, 
m, 
++ Ê] 5 
m, --n4-1 
m,—1, n [PR 2A ту, № 


It may be noticed that the left hand side is an $s$-th order pseudo-determinant while, on the right hand side, the first $\beta[x; \ldots]$ is an $(s-1)$-th order pseudo-determinant, the second group of terms involves $\beta_{ir}$, each such $\beta_{ir}$ being an $(s-2)$-th order pseudo-determinant, and the last term has a $\beta$ which is an $s$-th order pseudo-determinant with the exponent $m_s$ reduced to $m_s-1$. It may be also noticed that $\beta_{ir}$ may be conveniently written as


Meg 5% ove Teeri A. Hir i 0... Ma 10 


Toss б sse ео. Mer p ees My, 1 


(A.9.6.12) thus gives us a recurrence relation, whereby, proceeding along the chain and reducing $m_s$ to $m_{s-1}$ (in which case the pseudo-determinant will be zero) we have the following reduction of the integral by one dimension.


0 
ENS 3 5 ma ss тт
~ 
(A.9.6.13) l| z; 
Way ... M,N 
Man e Mp 
E 
. 
mish es Myn 
Mg — 5-1 
x E Bolz: m,—r' 4-1, п-1)(т,), 110 т 1) 
Wy as Rees Tp; Myy- M <e My, % 5 
8-1 Mg — ™Ms-1 
XS E Spl ae 
7—1 T'-l 
maps А Tune Magi N i Ty 


x f(x; m, J-m,.,—r' EL, m- 1)(mj)yaf (m, Ir 


where $(m)_p$ stands for $m(m-1)\cdots(m-p+1)$. The $s$-th order pseudo-determinant is thus thrown back on $(s-1)$-th and $(s-2)$-th order pseudo-determinants, and these again on $(s-2)$-th and $(s-3)$-th order ones and so on till we get to first order pseudo-determinants which are easily evaluated from the incomplete $\beta$-function tables.
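The two elementary ingredients at the end of the chain, the factorial symbol $(m)_p = m(m-1)\cdots(m-p+1)$ and the first order integral $\beta(x; m, n) = \int_0^x t^m(1-t)^n\,dt$, are easily computed. The sketch below is an illustration and not a substitute for the tables: it uses Simpson's rule in place of tabulated values and assumes $m, n \ge 0$ so that the integrand is bounded on $[0, x]$; the function names are hypothetical.

```python
def falling_factorial(m, p):
    """(m)_p = m (m - 1) ... (m - p + 1), with (m)_0 = 1."""
    result = 1
    for k in range(p):
        result *= m - k
    return result

def incomplete_beta(x, m, n, steps=1000):
    """beta(x; m, n) = integral of t**m * (1 - t)**n over [0, x], by Simpson's rule.

    Assumes 0 <= x <= 1, m >= 0 and n >= 0; steps must be even.
    """
    h = x / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        coefficient = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += coefficient * t ** m * (1 - t) ** n
    return total * h / 3

example_symbol = falling_factorial(5, 3)          # 5 * 4 * 3
example_integral = incomplete_beta(1.0, 2, 3)     # complete case, equal to B(3, 4)
```

For $x = 1$ the integral reduces to the complete beta function $B(m+1, n+1)$, which gives a convenient check of the quadrature.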


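(A.9.7): Evaluation of the integral

$$\int_{x_s=x_0}^{x} \int_{x_{s-1}=x_0}^{x_s} \cdots \int_{x_1=x_0}^{x_2} M \prod_{i=1}^{s} dx_i
= \beta[x, x_0; m_s, n; m_{s-1}, n; \ldots; m_1, n] \ \text{(say)}
= \beta\left[x, x_0;\ \begin{matrix}
m_s, n & m_{s-1}, n & \cdots & m_1, n \\
\vdots & \vdots & & \vdots \\
m_s, n & m_{s-1}, n & \cdots & m_1, n
\end{matrix}\right] \ \text{(say)},$$

where $M$ stands for the determinant under the integration sign in (A.9.1). We shall also use the notation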
(A.9.7.1)
$$\beta(x, x_0; m_s, n; m_{s-1}, n; \ldots; m_1, n)
= \int_{x_0}^{x} x_s^{m_s}(1-x_s)^{n} dx_s \int_{x_0}^{x_s} x_{s-1}^{m_{s-1}}(1-x_{s-1})^{n} dx_{s-1} \cdots \int_{x_0}^{x_3} x_2^{m_2}(1-x_2)^{n} dx_2 \int_{x_0}^{x_2} x_1^{m_1}(1-x_1)^{n} dx_1.$$
Proceeding now exactly as in sections (A.9.1), (A.9.2), (A.9.3), (A.9.4) and (A.9.5) with obvious modifications at each stage we have in place of (A.9.6.12) the following result
Is Mg Ming, a Mih 
man Mg ys su, My, Њ 
(A.9.7.2) B 15,20; 
p may fos ee Thy, TV 
[КЕ main GE my, Г 
=p X45 
we [ERE E Mı n | 
Ms —™Ms-1 1 
X 3o от) [0,1-0 ДГА: mr, nH) *# (жу; т," +1, n+1)] : 
Tea) 
This sae Tong. Mg peg, us Mg, Te 
s-1 mma A Min oi Tuin. Meg, W ... My, 
Tene тауа Вр 
fal T-p | 
22) Тат... Mery h Tu a, ous My, 0 ЗЕ. 
xL Ue о аты bm, р 2n4-1) $ 
(КЫЛЫК Т Ue O E | 


where $(m)_p = m(m-1)\cdots(m-p+1)$ and $\beta_0(x; m, n)$ stands for $x^m(1-x)^n$. The $s$-th order pseudo-determinant is thus thrown back on the $(s-1)$-th and $(s-2)$-th order pseudo-determinants, and so on till we get to first order pseudo-determinants which can be easily evaluated from the incomplete beta function tables.